[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <18123.5699.405125.137517@stoffel.org>
Date: Tue, 21 Aug 2007 12:43:47 -0400
From: "John Stoffel" <john@...ffel.org>
To: Robin Lee Powell <rlpowell@...italkingdom.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: NFS hang + umount -f: better behaviour requested.
Robin> I'm bringing this up again (I know it's been mentioned here
Robin> before) because I had been told that NFS support had gotten
Robin> better in Linux recently, so I have been (for my $dayjob)
Robin> testing the behaviour of NFS (autofs NFS, specifically) under
Robin> Linux with hard,intr and using iptables to simulate a hang.
So why are you mouting with hard,intr semantics? At my current
SysAdmin job, we mount everything (solaris included) with 'soft,intr'
and it works well. If an NFS server goes down, clients don't hang for
large periods of time.
Robin> fuser hangs, as far as I can tell indefinately, as does
Robin> lsof. umount -f returns after a long time with "busy", umount
Robin> -l works after a long time but leaves the system in a very
Robin> unfortunate state such that I have to kill things by hand and
Robin> manually edit /etc/mtab to get autofs to work again.
Robin> The "correct solution" to this situation according to
Robin> http://nfs.sourceforge.net/ is cycles of "kill processes" and
Robin> "umount -f". This has two problems: 1. It sucks. 2. If fuser
Robin> and lsof both hand (and they do: fuser has been on
Robin> "stat("/home/rpowell/"," for > 30 minutes now), I have no way to
Robin> pick which processes to kill.
Robin> I've read every man page I could find, and the only nfs option
Robin> that semes even vaguely helpful is "soft", but everything that
Robin> mentions "soft" also says to never use it.
I think the man pages are out of date, or ignoring reality. Try
mounting with soft,intr and see how it works for you. I think you'll
be happy.
Robin> This is the single worst aspect of adminning a Linux system that I,
Robin> as a carreer sysadmin, have to deal with. In fact, it's really the
Robin> only one I even dislike. At my current work place, we've lost
Robin> multiple person-days to this issue, having to go around and reboot
Robin> every Linux box that was hanging off a down NFS server.
Robin> I know many other admins who also really want Solaris style
Robin> "umount -f"; I'm sure if I passed the hat I could get a decent
Robin> bounty together for this feature; let me know if you're interested.
Robin> Thanks.
Robin> -Robin
Robin> --
Robin> http://www.digitalkingdom.org/~rlpowell/ *** http://www.lojban.org/
Robin> Reason #237 To Learn Lojban: "Homonyms: Their Grate!"
Robin> Proud Supporter of the Singularity Institute - http://singinst.org/
Robin> -
Robin> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Robin> the body of a message to majordomo@...r.kernel.org
Robin> More majordomo info at http://vger.kernel.org/majordomo-info.html
Robin> Please read the FAQ at http://www.tux.org/lkml/
Robin> !DSPAM:46ca1d9676791030010506!
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists