Message-ID: <alpine.DEB.2.11.1501280923410.31753@gentwo.org>
Date:	Wed, 28 Jan 2015 09:30:56 -0600 (CST)
From:	Christoph Lameter <cl@...ux.com>
To:	Joonsoo Kim <js1304@...il.com>
cc:	Joonsoo Kim <iamjoonsoo.kim@....com>, akpm@...uxfoundation.org,
	LKML <linux-kernel@...r.kernel.org>,
	Linux Memory Management List <linux-mm@...ck.org>,
	Pekka Enberg <penberg@...nel.org>, iamjoonsoo@....com,
	Jesper Dangaard Brouer <brouer@...hat.com>
Subject: Re: [RFC 1/3] Slab infrastructure for array operations

On Wed, 28 Jan 2015, Joonsoo Kim wrote:

> > GFP_SLAB_ARRAY_NEW is best for large quantities in either allocator,
> > since SLAB also has to construct local metadata structures.
>
> In the case of SLAB, constructing the local metadata is only a little
> extra work, so GFP_SLAB_ARRAY_NEW would not perform better than
> GFP_SLAB_ARRAY_LOCAL; it would add overhead from the extra page
> allocations. Because of this, I said that which option is best is
> implementation-specific, and therefore we should not expose it.

For large numbers of objects (hundreds or more), GFP_SLAB_ARRAY_LOCAL
will never have enough objects cached locally. GFP_SLAB_ARRAY_NEW will go
to the page allocator and bypass free-table creation and all the queuing
that objects normally go through in SLAB. AFAICT it's going to be a
significant win.
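
To make that concrete, here is a minimal caller sketch. The entry point
name and signature are my assumptions based on this RFC
(kmem_cache_alloc_array() returning the number of objects obtained); the
final API may well differ:

	/* Hypothetical sketch, not the final API: grab a large batch from
	 * fresh pages, then top up object by object. */
	static size_t fill_pool(struct kmem_cache *s, void **objs, size_t nr)
	{
		size_t got;

		/* GFP_SLAB_ARRAY_NEW: take fresh pages straight from the
		 * page allocator, bypassing the per-cpu/per-node queues. */
		got = kmem_cache_alloc_array(s, GFP_KERNEL | GFP_SLAB_ARRAY_NEW,
					     nr, objs);

		/* Fall back to the regular allocation path for any
		 * shortfall. */
		while (got < nr) {
			objs[got] = kmem_cache_alloc(s, GFP_KERNEL);
			if (!objs[got])
				break;
			got++;
		}
		return got;
	}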

A similar situation holds for the freeing operation. If freeing leaves
every object in a page free, we can likewise bypass the queues and put
the page directly back into the page allocator (to be implemented once we
agree on an approach).
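
The free side would then be the mirror image. Again only a sketch,
assuming a kmem_cache_free_array() counterpart exists:

	/* Hypothetical counterpart: free the whole batch in one call so
	 * the allocator can spot pages whose objects are all in the batch
	 * and hand them straight back to the page allocator instead of
	 * queuing each object individually. */
	kmem_cache_free_array(s, nr, objs);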

> Even if we narrow the problem down to SLUB, choosing the correct option
> is difficult enough. The user would have to know how many objects are
> cached in the kmem_cache in order to choose the best option, since the
> relative quantity is what makes the performance difference.

Ok, we could add a function call that calculates the number of objects
cached per cpu and per node. But that number is rather fluid and could
change at any moment.
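
A sketch of what such a (purely hypothetical, not in this RFC) query
could look like; the value is advisory at best:

	/*
	 * Hypothetical helper: report how many objects currently sit in
	 * the per-cpu and per-node caches of a kmem_cache. The count is
	 * stale as soon as it is read, since other cpus and interrupts
	 * keep allocating and freeing behind the caller's back.
	 */
	unsigned long kmem_cache_cached_objects(struct kmem_cache *s);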

> And how many objects are cached in the kmem_cache could change whenever
> the implementation changes.

The default when no options are specified is to first exhaust the node's
partial slabs, then allocate new slabs as long as more than one page's
worth of objects remains to be allocated, and only then satisfy the rest
from cpu-local objects. I think that is satisfactory for the majority of
cases.
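
In pseudocode the default policy would look roughly like this (all helper
names are invented for illustration; they return the number of objects
they provided, and error handling is omitted):

	/* Sketch of the default array-allocation order described above. */
	size_t got = 0;

	/* 1. Exhaust the node's partial slabs first. */
	got += take_from_node_partial_slabs(s, objs + got, nr - got);

	/* 2. While at least a full page worth of objects is still needed,
	 *    allocate fresh slab pages. */
	while (nr - got >= objects_per_slab_page(s))
		got += allocate_from_new_slab_page(s, objs + got, nr - got);

	/* 3. Satisfy the remainder from cpu-local objects. */
	got += take_from_cpu_local_objects(s, objs + got, nr - got);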

The detailed control options were requested at the LCA meeting in
Auckland. I am fine with dropping them if they do not make sense; that
would make the API and implementation simpler. Jesper, are you ok with
this?
