lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 6 Aug 2013 12:34:14 -0400
From:	Matthew Wilcox <willy@...ux.intel.com>
To:	Jan Kara <jack@...e.cz>
Cc:	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Al Viro <viro@...iv.linux.org.uk>,
	Hugh Dickins <hughd@...gle.com>,
	Wu Fengguang <fengguang.wu@...el.com>,
	Mel Gorman <mgorman@...e.de>, linux-mm@...ck.org,
	Andi Kleen <ak@...ux.intel.com>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Hillf Danton <dhillf@...il.com>, Dave Hansen <dave@...1.net>,
	Ning Qu <quning@...gle.com>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 01/23] radix-tree: implement preload for multiple
 contiguous elements

On Mon, Aug 05, 2013 at 01:17:39PM +0200, Jan Kara wrote:
> On Sun 04-08-13 05:17:03, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
> > The radix tree is variable-height, so an insert operation not only has
> > to build the branch to its corresponding item, it also has to build the
> > branch to existing items if the size has to be increased (by
> > radix_tree_extend).
> > @@ -82,16 +82,24 @@ static struct kmem_cache *radix_tree_node_cachep;
> >   * The worst case is a zero height tree with just a single item at index 0,
> >   * and then inserting an item at index ULONG_MAX. This requires 2 new branches
> >   * of RADIX_TREE_MAX_PATH size to be created, with only the root node shared.
> > + *
> > + * Worst case for adding N contiguous items is adding entries at indexes
> > + * (ULONG_MAX - N) to ULONG_MAX. It requires nodes to insert single worst-case
> > + * item plus extra nodes if you cross the boundary from one node to the next.
> > + *
> >   * Hence:
> >   */
> > -#define RADIX_TREE_PRELOAD_SIZE (RADIX_TREE_MAX_PATH * 2 - 1)
> > +#define RADIX_TREE_PRELOAD_MIN (RADIX_TREE_MAX_PATH * 2 - 1)
> > +#define RADIX_TREE_PRELOAD_MAX \
> > +	(RADIX_TREE_PRELOAD_MIN + \
> > +	 DIV_ROUND_UP(RADIX_TREE_PRELOAD_NR - 1, RADIX_TREE_MAP_SIZE))
>
>   Umm, is this really correct? I see two problems:
> 1) You may need internal tree nodes at various levels but you seem to
> account only for the level 1.
> 2) The rounding doesn't seem right because RADIX_TREE_MAP_SIZE+2 nodes may
> require 3 nodes at level 1 if the indexes are like:
> i_0 | i_1 .. i_{RADIX_TREE_MAP_SIZE} | i_{RADIX_TREE_MAP_SIZE+1}
>     ^                                ^
>     node boundary                    node boundary
> 
>   Otherwise the patch looks good.

You are correct that in the fully general case, these things are needed,
and the patch undercounts the number of nodes needed.  However, in the
specific case of THP pagecache, insertions are naturally aligned, and
we end up needing very few internal nodes (so few that we've never hit
the end of this array in some fairly heavy testing).

There are two penalties for getting the general case correct.  One is
that the calculation becomes harder to understand, and the other is
that we consume more per-CPU memory.  I think we should document that
the current code requires "natural alignment", and include a note about
what things will need to change in order to support arbitrary alignment
in case anybody needs to do it in the future, but not include support
for arbitrary alignment in this patchset.

What do you think?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ