Message-ID: <20161108193011.GA15802@cmpxchg.org>
Date:   Tue, 8 Nov 2016 14:30:11 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Jan Kara <jack@...e.cz>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH 5/6] mm: workingset: switch shadow entry tracking to
 radix tree exceptional counting

On Tue, Nov 08, 2016 at 11:27:16AM +0100, Jan Kara wrote:
> On Mon 07-11-16 14:07:40, Johannes Weiner wrote:
> > Currently, we track the shadow entries in the page cache in the upper
> > bits of the radix_tree_node->count, behind the back of the radix tree
> > implementation. Because the radix tree code has no awareness of them,
> > we rely on random subtleties throughout the implementation (such as
> > the node->count != 1 check in the shrinking code which is meant to
> > exclude multi-entry nodes, but also happens to skip nodes with only
> > one shadow entry since it's accounted in the upper bits). This is
> > error prone and has, in fact, caused the bug fixed in d3798ae8c6f3
> > ("mm: filemap: don't plant shadow entries without radix tree node").
> > 
> > To remove these subtleties, this patch moves shadow entry tracking
> > from the upper bits of node->count to the existing counter for
> > exceptional entries. node->count goes back to being a simple counter
> > of valid entries in the tree node and can be shrunk to a single byte.
> 
> ...
> 
> > diff --git a/mm/truncate.c b/mm/truncate.c
> > index 6ae44571d4c7..d3ce5f261f47 100644
> > --- a/mm/truncate.c
> > +++ b/mm/truncate.c
> > @@ -53,7 +53,6 @@ static void clear_exceptional_entry(struct address_space *mapping,
> >  	mapping->nrexceptional--;
> >  	if (!node)
> >  		goto unlock;
> > -	workingset_node_shadows_dec(node);
> >  	/*
> >  	 * Don't track node without shadow entries.
> >  	 *
> > @@ -61,8 +60,7 @@ static void clear_exceptional_entry(struct address_space *mapping,
> >  	 * The list_empty() test is safe as node->private_list is
> >  	 * protected by mapping->tree_lock.
> >  	 */
> > -	if (!workingset_node_shadows(node) &&
> > -	    !list_empty(&node->private_list))
> > +	if (!node->exceptional && !list_empty(&node->private_list))
> >  		list_lru_del(&workingset_shadow_nodes,
> >  				&node->private_list);
> >  	__radix_tree_delete_node(&mapping->page_tree, node);
> 
> Is this really correct now? The radix tree implementation can move a single
> exceptional entry at index 0 from a node into a direct pointer and free
> the node while it is still in the LRU list. Or am I missing something?

You're right. I missed that scenario.

> To fix this I'd prefer to just have a callback from radix tree code when it
> is freeing a node, rather than trying to second-guess its implementation in
> the page-cache code...
> 
> Otherwise the patch looks good to me and I really like the simplification!

That's a good idea. I'll do away with __radix_tree_delete_node()
altogether and move not just the slot accounting but also the tree
shrinking and the maintenance callback into __radix_tree_replace().

The page cache can then simply do

__radix_tree_replace(&mapping->page_tree, node, slot, new,
                     workingset_node_update, mapping)

And workingset_node_update() gets called on every node that changes,
where it can track and untrack it depending on count & exceptional.
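
For illustration, a minimal user-space sketch of this callback scheme
follows. It is not the kernel code: struct toy_node, toy_replace(),
toy_node_update() and the tag bit are hypothetical stand-ins, and the
"private" argument is a plain counter rather than the mapping. The
tracking rule used here (track a node exactly when it is non-empty and
holds only exceptional entries) is an assumption about what "depending
on count & exceptional" would mean; tree shrinking and node freeing
are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS	4
#define TOY_EXCEPTIONAL	2UL	/* assumed tag bit marking a shadow entry */

/* Toy stand-in for a radix tree node; not the kernel structure. */
struct toy_node {
	void *slots[TOY_SLOTS];
	unsigned int count;		/* all non-empty slots */
	unsigned int exceptional;	/* subset of count: shadow entries */
	bool tracked;			/* stands in for LRU list membership */
};

typedef void (*toy_update_fn)(struct toy_node *node, void *private);

static bool toy_is_exceptional(const void *entry)
{
	return ((uintptr_t)entry & TOY_EXCEPTIONAL) != 0;
}

/* Build a tagged, non-pointer entry carrying a small value. */
static void *toy_shadow(uintptr_t value)
{
	return (void *)((value << 2) | TOY_EXCEPTIONAL);
}

/*
 * Replace one slot, keep both counters in sync inside the "tree" code,
 * then notify the owner so it never has to second-guess the accounting.
 */
static void toy_replace(struct toy_node *node, size_t idx, void *new,
			toy_update_fn update, void *private)
{
	void *old = node->slots[idx];

	node->count += (new != NULL) - (old != NULL);
	node->exceptional += toy_is_exceptional(new) - toy_is_exceptional(old);
	node->slots[idx] = new;

	if (update)
		update(node, private);
}

/* Assumed policy: track nodes that contain nothing but shadow entries. */
static void toy_node_update(struct toy_node *node, void *private)
{
	unsigned int *nr_tracked = private;
	bool want = node->count && node->count == node->exceptional;

	assert(node->exceptional <= node->count);

	if (want && !node->tracked) {
		node->tracked = true;
		(*nr_tracked)++;
	} else if (!want && node->tracked) {
		node->tracked = false;
		(*nr_tracked)--;
	}
}
```

With this shape, a caller only ever goes through toy_replace(); storing
a shadow entry into an empty node tracks it, adding a regular pointer
untracks it, and emptying the node untracks it again, all from the one
callback.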

I'll give it some testing before posting it, but currently it's

 include/linux/radix-tree.h |   4 +-
 include/linux/swap.h       |   1 -
 lib/radix-tree.c           | 212 ++++++++++++++++++++-----------------------
 mm/filemap.c               |  48 +---------
 mm/truncate.c              |  16 +---
 mm/workingset.c            |  31 +++++--
 6 files changed, 134 insertions(+), 178 deletions(-)

on top of the simplifications of this patch 5/6.

Thanks for your input, Jan!
