[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070820225415.GL3956@digitalkingdom.org>
Date: Mon, 20 Aug 2007 15:54:15 -0700
From: Robin Lee Powell <rlpowell@...italkingdom.org>
To: linux-kernel@...r.kernel.org
Subject: NFS hang + umount -f: better behaviour requested.
(cc's to me appreciated)
It would be really, really nice if "umount -f" against a hung NFS
mount actually worked on Linux. As much as I hate Solaris, I
consider it the gold standard in this case: If I say
"umount -f /mount/that/is/hung" it just goes away, immediately, and
anything still trying to use it dies (with EIO, I'm told).
If I know the NFS server is down, that really is the correct
behaviour. I very much want this behaviour, and am willing to
bribe/pay for it, although my resources are limited.
Unless you're interested in details of my tests, stop here.
I'm bringing this up again (I know it's been mentioned here before)
because I had been told that NFS support had gotten better in Linux
recently, so I have been (for my $dayjob) testing the behaviour of
NFS (autofs NFS, specifically) under Linux with hard,intr and using
iptables to simulate a hang. fuser hangs, as far as I can tell
indefinately, as does lsof. umount -f returns after a long time with
"busy", umount -l works after a long time but leaves the system in a
very unfortunate state such that I have to kill things by hand and
manually edit /etc/mtab to get autofs to work again.
The "correct solution" to this situation according to
http://nfs.sourceforge.net/ is cycles of "kill processes" and
"umount -f". This has two problems: 1. It sucks. 2. If fuser
and lsof both hand (and they do: fuser has been on
"stat("/home/rpowell/"," for > 30 minutes now), I have no way to
pick which processes to kill.
I've read every man page I could find, and the only nfs option that
semes even vaguely helpful is "soft", but everything that mentions
"soft" also says to never use it.
This is the single worst aspect of adminning a Linux system that I,
as a carreer sysadmin, have to deal with. In fact, it's really the
only one I even dislike. At my current work place, we've lost
multiple person-days to this issue, having to go around and reboot
every Linux box that was hanging off a down NFS server.
I know many other admins who also really want Solaris style
"umount -f"; I'm sure if I passed the hat I could get a decent
bounty together for this feature; let me know if you're interested.
Thanks.
-Robin
--
http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Proud Supporter of the Singularity Institute - http://singinst.org/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists