linux-kernel - Re: [PATCH 0/5] make slab gfp fair

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Wed, 16 May 2007 22:40:39 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Christoph Lameter <clameter@....com>
Cc:	Matt Mackall <mpm@...enic.com>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, Thomas Graf <tgraf@...g.ch>,
	David Miller <davem@...emloft.net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Daniel Phillips <phillips@...gle.com>,
	Pekka Enberg <penberg@...helsinki.fi>
Subject: Re: [PATCH 0/5] make slab gfp fair

On Wed, 2007-05-16 at 13:27 -0700, Christoph Lameter wrote:
> On Wed, 16 May 2007, Peter Zijlstra wrote:
> 
> > > So its no use on NUMA?
> > 
> > It is, its just that we're swapping very heavily at that point, a
> > bouncing cache-line will not significantly slow down the box compared to
> > waiting for block IO, will it?
> 
> How does all of this interact with
> 
> 1. cpusets
> 
> 2. dma allocations and highmem?
> 
> 3. Containers?

Much like the normal kmem_cache would do; I'm not changing any of the
page allocation semantics.

For containers it could be that the machine is not actually swapping but
the container will be in dire straights.

> > > The problem here is that you may spinlock and take out the slab for one 
> > > cpu but then (AFAICT) other cpus can still not get their high priority 
> > > allocs satisfied. Some comments follow.
> > 
> > All cpus are redirected to ->reserve_slab when the regular allocations
> > start to fail.
> 
> And the reserve slab is refilled from page allocator reserves if needed?

Yes, using new_slab(), exacly as it would normally be.

> > > But this is only working if we are using the slab after
> > > explicitly flushing the cpuslabs. Otherwise the slab may be full and we
> > > get to alloc_slab.
> > 
> > /me fails to parse.
> 
> s->cpu[cpu] is only NULL if the cpu slab was flushed. This is a pretty 
> rare case likely not worth checking.

Ah, right:
 - !page || !page->freelist
 - and no available partial slabs.

then we try the reserve (if we're entiteld).

> > > Remove the above two lines (they are wrong regardless) and simply make 
> > > this the cpu slab.
> > 
> > It need not be the same node; the reserve_slab is node agnostic.
> > So here the free page watermarks are good again, and we can forget all
> > about the ->reserve_slab. We just push it on the free/partial lists and
> > forget about it.
> > 
> > But like you said above: unfreeze_slab() should be good, since I don't
> > use the lockless_freelist.
> 
> You could completely bypass the regular allocation functions and do
> 
> object = s->reserve_slab->freelist;
> s->reserve_slab->freelist = object[s->reserve_slab->offset];

That is basically what happens at the end; if an object is returned from
the reserve slab.

But its wanted to try the normal cpu_slab path first to detect that the
situation has subsided and we can resume normal operation.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/