Message-ID: <20131030212649.GG6188@dastard>
Date:	Thu, 31 Oct 2013 08:26:49 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Russell King - ARM Linux <linux@....linux.org.uk>,
	linux-mm <linux-mm@...ck.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Dave Chinner <dchinner@...hat.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] mm: list_lru: fix almost infinite loop causing effective
 livelock

On Wed, Oct 30, 2013 at 12:49:05PM -0700, Linus Torvalds wrote:
> On Wed, Oct 30, 2013 at 7:16 AM, Russell King - ARM Linux
> <linux@....linux.org.uk> wrote:
> >
> > So, if *nr_to_walk was zero when this function was entered, that means
> > we're wanting to operate on (~0UL)+1 objects - which might as well be
> > infinite.
> >
> > Clearly this is not correct behaviour.  If we think about the behaviour
> > of this function when *nr_to_walk is 1, then clearly it's wrong - we
> > decrement first and then test for zero - which results in us doing
> > nothing at all.  A post-decrement would give the desired behaviour -
> > we'd try to walk one object and one object only if *nr_to_walk were
> > one.
> >
> > It also gives the correct behaviour for zero - we exit at this point.
> 
> Good analysis.
> 
> HOWEVER.
> 
> I actually think even your version is very dangerous, because we pass
> in the *address* to that count, and the only real reason to do that is
> because we might call it in a loop, and we want the function to update
> that count.
> 
> And even your version still underflows from 0 to really-large-count.
> It *returns* when underflow happens, but you end up with the counter
> updated to a large value, and then anybody who uses it later would be
> screwed.
> 
> See, for example, the inline list_lru_walk() function in <linux/list_lru.h>
> 
> So I think we should either change that "unsigned long" to just
> "long", and then check for "<= 0" (like list_lru_walk() already does),
> or we should do
> 
>     if (!*nr_to_walk)
>         break;
>     --*nr_to_walk;
> 
> to make sure that we never do that underflow.

Yup, I missed that case. Thanks for finding and fixing it.
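
For anyone following along, the three variants discussed above can be
compared side by side. This is only a rough userspace sketch, not the
kernel code: the walk_*() helpers and the nr_items parameter are made up
for illustration, and the "list" is just a loop counter.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-ins for the walk loop. Each one visits at most
 * *nr_to_walk of nr_items items and returns how many it visited.
 */

/* Pre-decrement, then test: walks nothing for 1, wraps around for 0. */
static unsigned long walk_predec(unsigned long nr_items,
				 unsigned long *nr_to_walk)
{
	unsigned long visited = 0;

	for (unsigned long i = 0; i < nr_items; i++) {
		if (--*nr_to_walk == 0)
			break;
		visited++;
	}
	return visited;
}

/* Post-decrement: right item count, but the counter still underflows. */
static unsigned long walk_postdec(unsigned long nr_items,
				  unsigned long *nr_to_walk)
{
	unsigned long visited = 0;

	for (unsigned long i = 0; i < nr_items; i++) {
		if ((*nr_to_walk)-- == 0)
			break;
		visited++;
	}
	return visited;
}

/* Test first, decrement second: the counter never goes below zero. */
static unsigned long walk_checked(unsigned long nr_items,
				  unsigned long *nr_to_walk)
{
	unsigned long visited = 0;

	for (unsigned long i = 0; i < nr_items; i++) {
		if (!*nr_to_walk)
			break;
		--*nr_to_walk;
		visited++;
	}
	return visited;
}
```

With five items available and a count of 1, the pre-decrement version
visits nothing, and both other versions visit exactly one item. The
difference between those two is what the caller sees afterwards: the
post-decrement version leaves the counter at ULONG_MAX, the checked
version leaves it at 0.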

> I will modify your patch to do the latter, since it's the smaller
> change, but I suspect we should think about making that thing signed.

Yeah, I'll look into it. The shrinker API itself only ever feeds
shrinkctl->batch to it so we shouldn't ever have overflow problems
from that perspective...
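
As a starting point for the signed variant, something along these lines
is what I would expect it to look like. Again this is only a userspace
sketch under assumptions: walk_signed() and walk_all() are hypothetical
names, and the array of list sizes stands in for whatever the real
caller iterates over.

```c
#include <assert.h>

/*
 * Hypothetical signed-budget walk: visits at most *nr_to_walk of
 * nr_items items. A zero or negative budget walks nothing.
 */
static long walk_signed(long nr_items, long *nr_to_walk)
{
	long visited = 0;

	for (long i = 0; i < nr_items; i++) {
		if (*nr_to_walk <= 0)
			break;
		--*nr_to_walk;
		visited++;
	}
	return visited;
}

/*
 * Hypothetical caller sharing one budget across several lists, in the
 * spirit of calling the walk function in a loop with the same counter.
 */
static long walk_all(const long *list_sizes, int nr_lists, long budget)
{
	long total = 0;

	for (int i = 0; i < nr_lists; i++) {
		total += walk_signed(list_sizes[i], &budget);
		if (budget <= 0)
			break;
	}
	return total;
}
```

With a signed counter the "<= 0" test in the caller is meaningful, so a
zero or negative budget simply walks nothing instead of turning into a
huge count.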

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
