Message-ID: <20070820225415.GL3956@digitalkingdom.org>
Date:	Mon, 20 Aug 2007 15:54:15 -0700
From:	Robin Lee Powell <rlpowell@...italkingdom.org>
To:	linux-kernel@...r.kernel.org
Subject: NFS hang + umount -f: better behaviour requested.

(cc's to me appreciated)

It would be really, really nice if "umount -f" against a hung NFS
mount actually worked on Linux.  As much as I hate Solaris, I
consider it the gold standard in this case: If I say
"umount -f /mount/that/is/hung" it just goes away, immediately, and
anything still trying to use it dies (with EIO, I'm told).

If I know the NFS server is down, that really is the correct
behaviour.  I very much want this behaviour, and am willing to
bribe/pay for it, although my resources are limited.

Unless you're interested in details of my tests, stop here.

I'm bringing this up again (I know it's been mentioned here before)
because I had been told that NFS support had gotten better in Linux
recently, so I have been (for my $dayjob) testing the behaviour of
NFS (autofs NFS, specifically) under Linux with hard,intr and using
iptables to simulate a hang.  fuser hangs, as far as I can tell
indefinitely, as does lsof. umount -f returns after a long time with
"busy", umount -l works after a long time but leaves the system in a
very unfortunate state such that I have to kill things by hand and
manually edit /etc/mtab to get autofs to work again.

The "correct solution" to this situation according to
http://nfs.sourceforge.net/ is cycles of "kill processes" and
"umount -f".  This has two problems:  1.  It sucks.  2.  If fuser
and lsof both hang (and they do: fuser has been on
"stat("/home/rpowell/"," for > 30 minutes now), I have no way to
pick which processes to kill.
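
One possible way around the second problem, sketched here rather than
tested against a hung mount: instead of stat()ing paths on the dead
mount the way fuser does, read the symlinks under /proc/<pid>/cwd,
/proc/<pid>/root and /proc/<pid>/fd/* with readlink(), which as far as
I know is answered by the kernel without contacting the NFS server.
The function names and the proc_root parameter below are made up for
illustration, and the sketch does not look at memory-mapped files
(/proc/<pid>/maps), so it can miss users of the mount.

```python
import os


def is_under(path, mount):
    """True if path equals mount or lies below it (string check only, no stat)."""
    mount = mount.rstrip("/") or "/"
    if mount == "/":
        return path.startswith("/")
    return path == mount or path.startswith(mount + "/")


def pids_using_mount(mount, proc_root="/proc"):
    """Return sorted PIDs whose cwd, root, or open fds point under mount.

    Only listdir() and readlink() on proc_root are used; the paths on
    the (possibly hung) mount itself are never opened or stat()ed.
    """
    found = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "net"
        base = os.path.join(proc_root, entry)
        links = [os.path.join(base, "cwd"), os.path.join(base, "root")]
        fd_dir = os.path.join(base, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to list its fds
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # link vanished or is unreadable; try the next one
            if is_under(target, mount):
                found.add(int(entry))
                break
    return sorted(found)
```

Run as root, pids_using_mount("/home/rpowell") would give a list of
candidate PIDs to kill before retrying "umount -f". Whether those
processes can actually be killed still depends on the mount options
(intr, in my tests).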

I've read every man page I could find, and the only nfs option that
seems even vaguely helpful is "soft", but everything that mentions
"soft" also says to never use it.

This is the single worst aspect of adminning a Linux system that I,
as a career sysadmin, have to deal with.  In fact, it's really the
only one I even dislike. At my current work place, we've lost
multiple person-days to this issue, having to go around and reboot
every Linux box that was hanging off a down NFS server.

I know many other admins who also really want Solaris style
"umount -f"; I'm sure if I passed the hat I could get a decent
bounty together for this feature; let me know if you're interested.

Thanks.

-Robin

-- 
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/