lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 29 Sep 2007 10:47:12 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Christoph Lameter <clameter@....com>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Christoph Hellwig <hch@....de>, Mel Gorman <mel@...net.ie>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	David Chinner <dgc@....com>, Jens Axboe <jens.axboe@...cle.com>
Subject: Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK


On Sat, 2007-09-29 at 01:13 -0700, Andrew Morton wrote:
> On Fri, 28 Sep 2007 20:25:50 +0200 Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> 
> > 
> > On Fri, 2007-09-28 at 11:20 -0700, Christoph Lameter wrote:
> > 
> > > > start 2 processes that each mmap a separate 64M file, and which does
> > > > sequential writes on them. start a 3th process that does the same with
> > > > 64M anonymous.
> > > > 
> > > > wait for a while, and you'll see order=1 failures.
> > > 
> > > Really? That means we can no longer even allocate stacks for forking.
> > > 
> > > Its surprising that neither lumpy reclaim nor the mobility patches can 
> > > deal with it? Lumpy reclaim should be able to free neighboring pages to 
> > > avoid the order 1 failure unless there are lots of pinned pages.
> > > 
> > > I guess then that lots of pages are pinned through I/O?
> > 
> > memory got massively fragemented, as anti-frag gets easily defeated.
> > setting min_free_kbytes to 12M does seem to solve it - it forces 2 max
> > order blocks to stay available, so we don't mix types. however 12M on
> > 128M is rather a lot.
> > 
> > its still on my todo list to look at it further..
> > 
> 
> That would be really really bad (as in: patch-dropping time) if those
> order-1 allocations are not atomic.
> 
> What's the callsite? 

Ah, right, that was the detail... all this lumpy reclaim is useless for
atomic allocations. And with SLUB using higher order pages, atomic !0
order allocations will be very very common.

One I can remember was:

  add_to_page_cache()
    radix_tree_insert()
      radix_tree_node_alloc()
        kmem_cache_alloc()

which is an atomic callsite.

Which leaves us in a situation where we can load pages, because there is
free memory, but can't manage to allocate memory to track them.. 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ