Date:	Fri, 2 Feb 2007 13:54:31 -0500
From:	"Aaron Wiebe" <epiphani@...ertrip.net>
To:	linux-kernel@...r.kernel.org
Subject: uninterruptable fcntl calls

Greetings,

I've run into a situation where fcntl F_SETLKW calls hang almost
completely.  I've tried several approaches and have yet to find a way
to detect or recover from the hang.  I've never really ventured
outside userspace, so I'm turning to this list to try and get a
handle on this.

With NFSv3 over UDP, this happens VERY rarely; however, with the
volume I do, it's creating a problem.

In short, I am attempting to take a read or write lock, and the call
hangs to the point where no signal is delivered, not even SIGKILL.
I've tried timing out with alarm() and I've tried switching the file
descriptor to nonblocking - nothing I can think of prevents the hang
or even allows me to handle it.  I understand NFS locking can be
rather sketchy at times - but all I need is the ability to handle the
case.

I can force the process to die by sending a SIGKILL and then attaching
strace.  The strace reports a SIGSTOP for the process, after which the
kill signal is processed.

All I need here is a method of capturing this case.  I can "repair"
the stuck lock by regenerating the file, but I can't capture the case
in order to handle this in code.

Any help would be useful - I am currently running 2.6.15.6 compiled
with the NFS patches from linux-nfs.org, but this case was happening
before applying those patches.  I'd be happy to provide any more
information necessary.  I've been struggling with this one for a few
months now.

Thanks,
-Aaron


Straces:

rt_sigaction(SIGALRM, {0xb7f56640, [ALRM], 0}, {SIG_DFL}, 8) = 0
alarm(120)                              = 0
fcntl64(3, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}
[hangs]

Or:

fcntl64(3, F_GETFL)                     = 0x8002 (flags O_RDWR|O_LARGEFILE)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK|O_LARGEFILE) = 0
fcntl64(3, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=0, len=0}



Code used for locking:

static int db_lock(int fd, int type)
{
    struct flock fl;
    struct timespec tv;   /* on the stack, so nothing leaks on early return */
    int ret, c = 0;

    if(!(fd > 0))
        return -1;

#ifdef SIGALRM_HACK
    /* after two minutes, wig out */
    sigalrm_set();
    alarm(120);
#endif

    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;         /* 0 = lock through end of file */
    fl.l_type = type;

#ifdef NONBLOCKING_HACK
    set_nonblocking(fd);
#endif

    /* F_SETLKW blocks until the lock is granted; this loop only
       retries when the call fails (e.g. with EINTR) */
    while((ret = fcntl(fd, F_SETLKW, &fl)) < 0)
    {
        c++;
        if(c > 600)
        {
            /* we've been waiting for 60 seconds... */
            my_error("stuck on fcntl request, aborting");
            break;        /* ret is still < 0; fall through to cleanup */
        }
        tv.tv_nsec = 100000000;   /* 10th of a second wait */
        tv.tv_sec = 0;
        nanosleep(&tv, NULL);
    }
#ifdef SIGALRM_HACK
    sigalrm_unset();
#endif
#ifdef NONBLOCKING_HACK
    unset_nonblocking(fd);
#endif
    return ret;
}