Message-Id: <1218763945.5291.19.camel@sebastian.kern.oss.ntt.co.jp>
Date: Fri, 15 Aug 2008 10:32:25 +0900
From: Fernando Luis Vázquez Cao <fernando@....ntt.co.jp>
To: "J. Bruce Fields" <bfields@...ldses.org>
Cc: NAKANO Hiroaki <nakano.hiroaki@....ntt.co.jp>,
Trond.Myklebust@...app.com, neilb@...e.de,
linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH]lockd: fix handling of grace period after long periods of inactivity
Hi Bruce!
On Thu, 2008-08-14 at 15:06 -0400, J. Bruce Fields wrote:
> On Thu, Aug 14, 2008 at 08:08:16PM +0900, NAKANO Hiroaki wrote:
> > lockd uses time_before() to determine whether the grace period has
> > expired. This would seem to be enough to avoid timer wrap-around issues,
> > but, unfortunately, that is not the case. The time_* family of
> > comparison functions can be safely used to compare jiffies relatively
> > close in time, but they stop working after approximately LONG_MAX/2
> > ticks. nfsd can suffer this problem because the time_before() comparison
> > in lockd() is not performed until the first request comes in, which
> > means that if there is no lockd traffic for more than LONG_MAX/2 ticks
> > we are screwed.
> >
> > The implication of this is that once time_before() starts misbehaving
> > any attempt from a NFS client to execute fcntl() will be received with a
> > NLM_LCK_DENIED_GRACE_PERIOD message for 25 days (assuming HZ=1000). In
> > other words, the 50 seconds grace period could turn into a grace period
> > of 50 days or more.
> >
> > This patch corrects this behavior by implementing grace period with a
> > (retriggerable) timer.
> >
> > Note: This bug was analyzed independently by Oda-san <oda@...inux.co.jp>
> > and myself.
>
> Good catch! Did you actually run across this in practice? I would've
> thought it relatively unusual to have a lockd that didn't receive its
> first lock request until 25 days after startup.
Yes, we did find this problem in production. More often than one would
wish, installing new software on a system that has been running without
a hiccup for weeks or months is all it takes to cause mayhem.
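For anyone following along, the failure mode can be modelled in user
space. The snippet below is only a sketch, not lockd code: it assumes a
32-bit jiffies counter, and time_before32() and in_grace_period() are
names made up for illustration. time_before32() mirrors the signed
subtraction that the kernel's time_before() macro performs.

```c
#include <stdint.h>

/*
 * User-space model of time_before(a, b) for a 32-bit jiffies counter
 * (assumption: sizeof(long) == 4, as on 32-bit kernels).
 * True when the wrapped difference a - b is negative as a signed value.
 * The unsigned-to-signed conversion is implementation-defined in ISO C,
 * but yields the two's-complement result on common compilers.
 */
static int time_before32(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical helper: the check is only evaluated when a request
 * arrives, so "now" can be arbitrarily far past "grace_end".
 */
static int in_grace_period(uint32_t now, uint32_t grace_end)
{
    return time_before32(now, grace_end);
}
```

With grace_end = 50000 (50 seconds at HZ=1000), in_grace_period() is
true at now = 10000 and false at now = 60000, as expected. At
now = 50000 + 0x80000000, though, the wrapped difference turns negative
and the model reports the grace period as still active. At HZ=1000,
0x80000000 ticks is a little under 25 days.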
> I still have a mild preference for a work struct just in case we end up
> wanting to do something slightly more complicated to end the grace
> period, but I don't really have anything in mind.
For simplicity, I think we could get Nakano-san's patch merged first.
If needed, moving to a work-based solution should be relatively easy.
Thank you for your comments!
- Fernando