linux-kernel - Re: [PATCH] shrinker: fix a bug when callback returns -1

Message-ID: <Pine.LNX.4.64.1108310923320.22628@hs20-bc2-1.build.redhat.com>
Date:	Wed, 31 Aug 2011 09:31:29 -0400 (EDT)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	Dave Chinner <david@...morbit.com>
cc:	Dave Chinner <dchinner@...hat.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Christoph Hellwig <hch@...radead.org>, dm-devel@...hat.com,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] shrinker: fix a bug when callback returns -1



On Wed, 31 Aug 2011, Dave Chinner wrote:

> On Tue, Aug 30, 2011 at 03:52:02PM -0400, Mikulas Patocka wrote:
> > On Tue, 30 Aug 2011, Dave Chinner wrote:
> > 
> > > On Mon, Aug 29, 2011 at 03:36:48PM -0400, Mikulas Patocka wrote:
> > > > Hi
> > > > 
> > > > This patch fixes a lockup when shrinker callback returns -1.
> > > 
> > > What lockup is that? I haven't seen any bug reports, and this code
> > > has been like this for several years, so I'm kind of wondering why
> > > this is suddenly an issue....
> > 
> > I got the lockups when modifying my own dm-bufio code to use the shrinker. 
> > The reason for lockups was that the variable total_scan contained 
> > extremely high values.
> 
> Your new shrinker was returning -1 when nr_to_scan == 0? That's not
> correct - you should be returning the count of objects (regardless
> of the specified gfp_mask) or 0 if you can't get one for whatever
> reason....
> 
> > The only possible way such extreme values could end up in 
> > total_scan was this:
> > 
> > max_pass = do_shrinker_shrink(shrinker, shrink, 0);
> > delta = (4 * nr_pages_scanned) / shrinker->seeks;
> > delta *= max_pass;
> > do_div(delta, lru_pages + 1);
> > total_scan += delta;
> > 
> > --- you don't test whether do_shrinker_shrink returned -1 here. The 
> > variables are unsigned long, so you end up adding an extreme value 
> > (approximately 2^64/(lru_pages+1)) to total_scan.
> 
> That's not the only way to get large values in total_scan. If
> you do any amount of GFP_NOFS/GFP_NOIO allocation and the shrinker
> aborts when it sees this, the shrinker->nr total will aggregate
> until it becomes large. total_scan contains that aggregation because
> it starts from the current value of shrinker->nr.
> 
> > Note that some existing shrinkers contain a workaround for this (something 
> > like "return nr_to_scan ? -1 : 0"),
> 
> That's not a workaround - that is exactly how the current API
> expects them to operate. That is, when counting objects, you return
> the count of objects. If you can't get the count, you return 0.

So apply this patch, which states this requirement in the specification 
and fixes the bug in fs/super.c.
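
To make the contract concrete, here is a minimal userspace sketch of a 
callback that follows it. This is not kernel code: struct model_cache, 
struct model_shrink_control, the can_lock flag and model_shrink() are 
hypothetical stand-ins invented for illustration (can_lock plays the role 
of grab_super_passive() succeeding), and the "freeing" is just a counter 
decrement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real structures. */
struct model_cache {
	long nr_objects;	/* objects currently held in the cache */
	bool can_lock;		/* models grab_super_passive() succeeding */
};

struct model_shrink_control {
	unsigned long nr_to_scan;	/* 0 means "only report the count" */
};

/*
 * Model of a shrinker callback that obeys the contract discussed here:
 *  - nr_to_scan == 0: return the object count, or 0 if it is unavailable
 *  - nr_to_scan != 0: return -1 if no scanning can be done right now,
 *    otherwise scan and return the number of objects remaining
 */
static int model_shrink(struct model_cache *c,
			const struct model_shrink_control *sc)
{
	assert(c->nr_objects >= 0);

	/* Never answer a count query (nr_to_scan == 0) with -1. */
	if (!c->can_lock)
		return sc->nr_to_scan ? -1 : 0;

	if (sc->nr_to_scan) {
		long want = (long)sc->nr_to_scan;
		long freed = want < c->nr_objects ? want : c->nr_objects;

		c->nr_objects -= freed;	/* pretend we freed these objects */
	}

	return (int)c->nr_objects;
}
```

With can_lock false, a count query on this model yields 0 and a scan 
request yields -1, which is the same distinction the fs/super.c hunk 
below makes.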

> Did I mention I was rewriting the API to make it more sane, obvious
> and simple to implement correctly?
> 
> > while some can still return -1 when 
> > nr_to_scan is 0 and trigger this bug (prune_super).
> 
> prune_super() will only return -1 if grab_super_passive() fails,
> which indicates that something serious is happening on the
> superblock (like unmount, remount or freeze) in which case the
> caches are about to or already undergoing significant change anyway.
>
> It could be seen as a bug, but it's really a "don't care" case - it
> doesn't matter what the calculated value is because it doesn't
> matter what the shrinker does after such a failure - the next call
> is going to fail to grab the superblock, too.

It is a bug, because the shrinker will loop roughly 
2^64/(lru_pages+1)/batch_size times if that "-1" is returned at one 
particular place --- in "max_pass = do_shrinker_shrink(shrinker, shrink, 0)".
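
The arithmetic can be checked with a small userspace model. This is only 
a sketch under stated assumptions: it uses 64-bit unsigned integers 
throughout, plain division in place of do_div(), and the helper names 
scan_delta() and scan_batches() are hypothetical, not kernel functions. 
The input values in the note after the code are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the delta computation quoted above. callback_ret is what the
 * callback returned for nr_to_scan == 0; storing -1 in an unsigned
 * variable turns it into 2^64 - 1.
 */
static uint64_t scan_delta(uint64_t nr_pages_scanned, uint64_t seeks,
			   int callback_ret, uint64_t lru_pages)
{
	uint64_t max_pass = (uint64_t)callback_ret;
	uint64_t delta;

	assert(seeks != 0);

	delta = (4 * nr_pages_scanned) / seeks;
	delta *= max_pass;		/* wraps modulo 2^64 */
	return delta / (lru_pages + 1);
}

/* Number of full batches needed to work off total_scan objects. */
static uint64_t scan_batches(uint64_t total_scan, uint64_t batch_size)
{
	assert(batch_size != 0);
	return total_scan / batch_size;
}
```

For example, scan_delta(32, 2, -1, 999) comes out on the order of 10^16, 
while the same call with a callback result of 500 gives 32. Dividing the 
first value by a batch size of 128 (an illustrative value) still leaves 
more than 10^14 iterations.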

Mikulas

> And FWIW, that wart also goes away with the shrinker API rework.
> 
> Cheers,
> 
> Dave.

---

Fix shrinker callback bug in fs/super.c

The callback must not return -1 when nr_to_scan is zero. Fix the bug in
fs/super.c and add this requirement to the callback specification.

CC: stable@...nel.org
Signed-off-by: Mikulas Patocka <mpatocka@...hat.com>

---
 fs/super.c               |    2 +-
 include/linux/shrinker.h |    1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

Index: linux-3.1-rc3-fast/fs/super.c
===================================================================
--- linux-3.1-rc3-fast.orig/fs/super.c	2011-08-31 15:21:01.000000000 +0200
+++ linux-3.1-rc3-fast/fs/super.c	2011-08-31 15:21:21.000000000 +0200
@@ -61,7 +61,7 @@ static int prune_super(struct shrinker *
 		return -1;
 
 	if (!grab_super_passive(sb))
-		return -1;
+		return !sc->nr_to_scan ? 0 : -1;
 
 	if (sb->s_op && sb->s_op->nr_cached_objects)
 		fs_objects = sb->s_op->nr_cached_objects(sb);
Index: linux-3.1-rc3-fast/include/linux/shrinker.h
===================================================================
--- linux-3.1-rc3-fast.orig/include/linux/shrinker.h	2011-08-31 15:21:28.000000000 +0200
+++ linux-3.1-rc3-fast/include/linux/shrinker.h	2011-08-31 15:21:58.000000000 +0200
@@ -20,6 +20,7 @@ struct shrink_control {
  * 'nr_to_scan' entries and attempt to free them up.  It should return
  * the number of objects which remain in the cache.  If it returns -1, it means
  * it cannot do any scanning at this time (eg. there is a risk of deadlock).
+ * The callback must not return -1 if nr_to_scan is zero.
  *
  * The 'gfpmask' refers to the allocation we are currently trying to
  * fulfil.