linux-kernel - RE: On the optimum size of a batch

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5c54bfe5123f4e6390400599427c23b7@AcuMS.aculab.com>
Date: Mon, 11 Mar 2024 21:42:46 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Matthew Wilcox' <willy@...radead.org>, Dave Chinner <david@...morbit.com>
CC: "linux-mm@...ck.org" <linux-mm@...ck.org>, "linux-fsdevel@...r.kernel.org"
	<linux-fsdevel@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
Subject: RE: On the optimum size of a batch

From: Matthew Wilcox
> Sent: 11 March 2024 03:42
..
> But that doesn't necessarily mean that you want a larger batch size.
> Because you're not just allocating, you're also freeing and over a
> large enough timescale the number of objects allocated and freed is
> approximately equal.  In the SLUB case, your batch size needs to be
> large enough to absorb most of the allcation-vs-free bias jitter; that
> is if you know they always alternate AFAFAFAFAF a batch size of 2 would
> be fine.  If you know you get four allocations followed by four frees,
> having a batch size of 5 woud be fine.  We'd never go to the parent
> allocator if we got a AFAAFFAAAFFFAAAAFFFFAAFFAFAAFAAFFF pattern.

That isn't really the allocation pattern you need to worry about.
Per cpu free list should be large enough to handle it.
The problem (as the first people doing a sparc SMP port found) is
that you'll get one bit of code that 'pumps' items from one
free list to another.
So you have to decide that you have too many local free objects
and then give some back to the global free list.
Keeping them on the global list as a block of 'n' items can
make things far better.
Indeed 'arrays of pointers' are likely to be better than a
linked list.

Caches in front of SLUB (or similar) really shouldn't be needed.
Except, perhaps, to indicate which list the items come from
and, maybe for some extra allocation stats.
There might be rcu oddities - where the memory can't be used
for a different structure. But there are probably alternative
solutions to that one.

The page free code is a different problem.
I suspect that needs to process batches of items to avoid
repeated (expensive) atomic accesses.
But it definitely needs to avoid thrashing the L1 cache.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)