linux-kernel - RE: [PATCH 1/3] slub: set a criteria for slub node partial adding

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1323883989.2334.68.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
Date:	Wed, 14 Dec 2011 18:33:09 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	"Alex,Shi" <alex.shi@...el.com>,
	David Rientjes <rientjes@...gle.com>,
	"penberg@...nel.org" <penberg@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: RE: [PATCH 1/3] slub: set a criteria for slub node partial adding

Le mercredi 14 décembre 2011 à 08:59 -0600, Christoph Lameter a écrit :

> Many people have done patchsets like this. 

Things changed a lot recently. There is room for improvements.

At least we can exchange ideas _before_ coding a new patchset ?

> There are various permutations
> on SL?B (I dont remember them all SLEB, SLXB, SLQB etc) that have been
> proposed over the years. Caches tend to grow and get rather numerous (see
> SLAB) and the design of SLUB was to counter that. There is a reason it was
> called SLUB. The U stands for Unqueued and was intended to avoid the
> excessive caching problems that I ended up when reworking SLAB for NUMA
> support.

Current 'one active slab' per cpu is a one level cache.

It really is a _queue_ containing a fair amount of objects.

'Unqueued' in SLUB is marketing hype :=)

When we have one producer (say network interrupt handler) feeding
millions of network packets to N consumers (other cpus), each free is
slowpath. They all want to touch page->freelist and slow the producer as
well because of false sharing.

Furthermore, when the producer hits socket queue limits, it mostly frees
skbs that were allocated in the 'not very recent past', and its own
freeing also hit slow path (because memory blocks of the skb are no
longer in the current active slab). It competes with frees done by
consumers as well.

Adding a second _small_ cache to queue X objects per cpu would help to
keep the active slab longer and more 'private' (its 'struct page' not
touched too often by other cpus) for a given cpu.

It would limit number of cache line misses we currently have because of
conflicting accesses to page->freelist just to push one _single_ object
(and n->list_lock in less extent)

My initial idea would be to use a cache of 4 slots per cpu, but be able
to queue many objects per slot, if they all belong to same slab/page.

In case we must make room in the cache (all slots occupied), we take one
slot and dequeue all objects in one round. No extra latency compared to
current schem.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/