Message-ID: <20131030212649.GG6188@dastard>
Date: Thu, 31 Oct 2013 08:26:49 +1100
From: Dave Chinner <david@...morbit.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Russell King - ARM Linux <linux@....linux.org.uk>,
linux-mm <linux-mm@...ck.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Dave Chinner <dchinner@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] mm: list_lru: fix almost infinite loop causing effective livelock

On Wed, Oct 30, 2013 at 12:49:05PM -0700, Linus Torvalds wrote:
> On Wed, Oct 30, 2013 at 7:16 AM, Russell King - ARM Linux
> <linux@....linux.org.uk> wrote:
> >
> > So, if *nr_to_walk was zero when this function was entered, that means
> > we're wanting to operate on (~0UL)+1 objects - which might as well be
> > infinite.
> >
> > Clearly this is not correct behaviour. If we think about the behaviour
> > of this function when *nr_to_walk is 1, then clearly it's wrong - we
> > decrement first and then test for zero - which results in us doing
> > nothing at all. A post-decrement would give the desired behaviour -
> > we'd try to walk one object and one object only if *nr_to_walk were
> > one.
> >
> > It also gives the correct behaviour for zero - we exit at this point.
>
> Good analysis.
>
> HOWEVER.
>
> I actually think even your version is very dangerous, because we pass
> in the *address* to that count, and the only real reason to do that is
> because we might call it in a loop, and we want the function to update
> that count.
>
> And even your version still underflows from 0 to really-large-count.
> It *returns* when underflow happens, but you end up with the counter
> updated to a large value, and then anybody who uses it later would be
> screwed.
>
> See, for example, the inline list_lru_walk() function in <linux/list_lru.h>
>
> So I think we should either change that "unsigned long" to just
> "long", and then check for "<= 0" (like list_lru_walk() already does),
> or we should do
>
>         if (!*nr_to_walk)
>                 break;
>         --*nr_to_walk;
>
> to make sure that we never do that underflow.

Yup, I missed that case. Thanks for finding and fixing it.

> I will modify your patch to do the latter, since it's the smaller
> change, but I suspect we should think about making that thing signed.

Yeah, I'll look into it. The shrinker API itself only ever feeds
shrinkctl->batch to it so we shouldn't ever have overflow problems
from that perspective...

Cheers,
Dave.
--
Dave Chinner
david@...morbit.com