Running 'make check' hangs for ever

Isaac Boukris iboukris at gmail.com
Sat Aug 13 12:28:23 EDT 2016


On Fri, Jan 1, 2016 at 12:06 AM, Isaac Boukris <iboukris at gmail.com> wrote:
> On Thu, Dec 31, 2015 at 2:43 AM, Isaac Boukris <iboukris at gmail.com> wrote:
>> On Thu, Dec 31, 2015 at 1:08 AM, Greg Hudson <ghudson at mit.edu> wrote:
>>> On 12/30/2015 04:28 PM, Isaac Boukris wrote:
>>>> [pid 21891] fcntl64(6</home/admin/git/krb5/src/lib/krb5/ccache/testdir/db.kadm5.lock>,
>>>> F_OFD_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0,
>>>> l_len=-5230051357888610304}) = -1 EINVAL (Invalid argument)
>>>> [pid 21891] fcntl64(6</home/admin/git/krb5/src/lib/krb5/ccache/testdir/db.kadm5.lock>,
>>>> F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}
>>>
>>> Unfortunately, I can't really tell what's going on.  In ofdlock(), we
>>> pass the same struct flock pointer to both fcntl() invocations when we
>>> fall back to F_SETLKW, so I don't know why the first invocation is
>>> reported with a garbage l_len.  I also don't know why the second invocation
>>> is blocking; did the first invocation somehow obtain a lock despite
>>> returning EINVAL?  I can't find any search results about a known kernel
>>> or glibc bug which might explain this odd behavior.
>>
>>
>> Strange indeed, it does look like the fd is locked:
>> # cat /proc/24040/fdinfo/6
>> pos:    0
>> flags:  02000002
>> mnt_id: 57
>> lock:   1: POSIX  ADVISORY  READ  24040 fd:00:1195813 0 EOF
>>
>> I've pasted a longer output of strace at:
>> http://pastebin.com/Rw8nvjCZ
>
>
> FYI I think there is something wrong with my system.
> I tried to investigate by starting with an example from:
> https://www.gnu.org/software/libc/manual/html_node/Open-File-Description-Locks-Example.html
>
> But it gets stuck in a similar manner:
>
> [pid  6683] fcntl64(4</tmp/foo>, F_OFD_SETLK, {l_type=F_UNLCK,
> l_whence=SEEK_SET, l_start=4294967296, l_len=8367752102667091968}) =
> -1 EINVAL (Invalid argument)
> [pid  6683] nanosleep({0, 1000},  <unfinished ...>
> [pid  6682] <... open resumed> )        = 5</tmp/foo>
> [pid  6683] <... nanosleep resumed> NULL) = 0
> [pid  6683] madvise(0xb6571000, 8372224, MADV_DONTNEED) = 0
> [pid  6683] exit(0)                     = ?
> [pid  6683] +++ exited with 0 +++
> [pid  6682] fcntl64(5</tmp/foo>, F_OFD_SETLKW, {l_type=F_WRLCK,
> l_whence=SEEK_SET, l_start=4294967296, l_len=0}
>
> Thanks for the feedback and happy new year!


FYI 2, I've hit this again and looked a little further.
It looks like the bug is in glibc on 32-bit architectures; see:
https://sourceware.org/bugzilla/show_bug.cgi?id=20251

Changing 'struct flock' to 'struct flock64' in 'lock_file.c' also works.
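
For illustration, here is a minimal sketch of that workaround. It is not the
actual krb5 code: the helper names lock_whole_file() and demo_lock_tmpfile()
and the temporary file path are made up for this example. As far as I
understand the bug report, on 32-bit builds without _FILE_OFFSET_BITS=64 the
plain 'struct flock' has 32-bit offsets while the F_OFD_* commands expect the
64-bit layout, which would explain the garbage l_len in the strace output.
Using 'struct flock64' explicitly avoids the mismatch:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc; exposes F_OFD_SETLK and struct flock64 */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: take or release a non-blocking lock on the whole file.
 * 'type' is F_RDLCK, F_WRLCK or F_UNLCK.  Returns 0 on success, -1 on error.
 */
int lock_whole_file(int fd, short type)
{
    struct flock64 lock;

    /* Zero everything; l_pid must be 0 for OFD locks. */
    memset(&lock, 0, sizeof(lock));
    lock.l_type = type;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;               /* 0 means "through end of file" */

#ifdef F_OFD_SETLK
    if (fcntl(fd, F_OFD_SETLK, &lock) == 0)
        return 0;
    if (errno != EINVAL)          /* EINVAL: kernel lacks OFD locks */
        return -1;
#endif
    /* Fall back to a classic POSIX lock, using the 64-bit command
     * that matches struct flock64. */
    return fcntl(fd, F_SETLK64, &lock);
}

/* Hypothetical demo: lock and unlock a scratch file under /tmp. */
int demo_lock_tmpfile(void)
{
    char path[] = "/tmp/ofdlock-XXXXXX";
    int fd = mkstemp(path);
    int ret;

    if (fd < 0)
        return -1;
    ret = lock_whole_file(fd, F_WRLCK);
    if (ret == 0)
        ret = lock_whole_file(fd, F_UNLCK);
    close(fd);
    unlink(path);
    return ret;
}
```

The sketch uses the non-blocking F_OFD_SETLK rather than F_OFD_SETLKW so that
a failure shows up as an error instead of a hang. Building with
-D_FILE_OFFSET_BITS=64 should have a similar effect on 32-bit systems, since
'struct flock' then has the 64-bit layout, but I have not verified that here.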

Regards.

