Message-ID: <20100208221318.GL11483@discord.disaster>
Date:	Tue, 9 Feb 2010 09:13:18 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Nick Piggin <npiggin@...e.de>
Cc:	Christoph Lameter <cl@...ux-foundation.org>, tytso@....edu,
	Andi Kleen <andi@...stfloor.org>,
	Miklos Szeredi <miklos@...redi.hu>,
	Alexander Viro <viro@....linux.org.uk>,
	Christoph Hellwig <hch@...radead.org>,
	Christoph Lameter <clameter@....com>,
	Rik van Riel <riel@...hat.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	akpm@...ux-foundation.org, Nick Piggin <nickpiggin@...oo.com.au>,
	Hugh Dickins <hugh@...itas.com>, linux-kernel@...r.kernel.org
Subject: Re: inodes: Support generic defragmentation

On Mon, Feb 08, 2010 at 06:37:53PM +1100, Nick Piggin wrote:
> On Thu, Feb 04, 2010 at 11:13:15AM -0600, Christoph Lameter wrote:
> > On Thu, 4 Feb 2010, Nick Piggin wrote:
> > 
> > > Well what I described is to do the slab pinning from the reclaim path
> > > (rather than from slab calling into the subsystem). All slab locking
> > > basically "innermost", so you can pretty much poke the slab layer as
> > > much as you like from the subsystem.
> > 
> > Reclaim/defrag is called from the reclaim path (of the VM). We could
> > enable a call from the fs reclaim code into the slab. But how would this
> > work?
> 
> Well the exact details will depend, but I feel that things should
> get easier because you pin the object (and therefore the slab) via
> the normal and well tested reclaim paths.
> 
> So for example, for dcache, you will come in and take the normal
> locks: dcache_lock, sb_lock, pin the sb, umount_lock. At which
> point you have pinned dentries without changing any locking. So
> then you can find the first entry on the LRU, and should be able
> to then build a list of dentries on the same slab.
> 
> You still have the potential issue of now finding objects that would
> not be visible by searching the LRU alone. However at least the
> locking should be simplified.

Very true, but that leads us to the same problem of fragmented
caches: we empty unused objects off slabs that are still pinned by
hot objects, so the page is never freed. I agree that we can't
totally avoid this problem, but I still think that using an
object-based LRU for reclaim has a fundamental mismatch with
page-based reclaim that makes this problem worse than it could be.
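
A toy model may make the mismatch concrete. The sketch below is not
kernel code: the (page_id, last_used) tuples, the function name and the
four-objects-per-page layout are all assumptions made up for
illustration. It frees the globally oldest objects, as an object-based
LRU would, and then reports which pages actually became empty.

```python
def object_lru_reclaim(objects, nr_to_free):
    """Free the nr_to_free least recently used objects.

    objects is a list of (page_id, last_used) tuples (hypothetical model).
    Returns (survivors, freed_pages), where freed_pages is the set of
    pages left with no live object on them.
    """
    by_age = sorted(objects, key=lambda obj: obj[1])
    victims, survivors = by_age[:nr_to_free], by_age[nr_to_free:]
    live_pages = {page for page, _ in survivors}
    freed_pages = {page for page, _ in victims} - live_pages
    return survivors, freed_pages


# Four slab pages of four objects each. Slot 0 on every page is hot
# (last_used=100); the other three objects on each page are cold.
objects = [(page, 100 if slot == 0 else page + slot)
           for page in range(4) for slot in range(4)]

survivors, freed_pages = object_lru_reclaim(objects, 12)
# 12 of 16 objects are gone, but each page still holds one hot object,
# so no page can be handed back.
print(len(survivors), freed_pages)
```

In this model three quarters of the objects are reclaimed and not a
single page is released, which is the fragmented-cache outcome
described above.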

FWIW, if we change the above to keep a page-based LRU in the slab
cache, and have the slab pick a page to reclaim, then the problem
mostly goes away, I think. We don't need to pin the slab to select
and prepare a page to reclaim - the cache only needs to be locked
before it starts reclaim. I think this has a much better chance of
reclaiming entire pages in situations where object-based LRU reclaim
will leave fragmentation.

i.e. instead of:

	shrink_slab
	  -> external shrinker
	    -> lock cache
	    -> find reclaimable object
	      -> call into slab w/ object
	        -> return longer list of objects
	    -> reclaim objects

we do:

	shrink_slab
	  -> internal shrinker
	    -> find oldest page and make object list
	      -> external shrinker
	        -> lock cache
	        -> reclaim objects
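
The second call chain could be sketched roughly as follows. This is a
user-space illustration only, not a proposed kernel interface: the
Cache and Slab classes, the reclaim_objects() callback, the in_use set
and the per-page last_used stamp are all hypothetical names, and how
the slab would really age pages is left open here.

```python
import threading


class Cache:
    """Hypothetical subsystem cache playing the 'external shrinker' role."""

    def __init__(self):
        self.lock = threading.Lock()
        self.in_use = set()  # objects still referenced; cannot be freed

    def reclaim_objects(self, objs):
        """Lock the cache, then return the objects that could be freed."""
        with self.lock:
            return [o for o in objs if o not in self.in_use]


class Slab:
    """Hypothetical slab playing the 'internal shrinker' role."""

    def __init__(self, cache):
        self.cache = cache
        # page_id -> {"last_used": int, "objects": set} (assumed layout)
        self.pages = {}

    def shrink(self):
        """Pick the oldest page, build its object list, call the cache.

        Returns the page id if the whole page was freed, else None.
        No cache lock is held while the page is being selected.
        """
        if not self.pages:
            return None
        page_id = min(self.pages, key=lambda p: self.pages[p]["last_used"])
        page = self.pages[page_id]
        freed = self.cache.reclaim_objects(sorted(page["objects"]))
        page["objects"] -= set(freed)
        if page["objects"]:
            return None  # something on the page is still pinned
        del self.pages[page_id]
        return page_id


cache = Cache()
slab = Slab(cache)
slab.pages = {
    0: {"last_used": 50, "objects": {"a", "b"}},
    1: {"last_used": 10, "objects": {"c", "d"}},
}
print(slab.shrink())  # page 1 is oldest and nothing on it is in use
```

One limitation of this sketch: a page that cannot be fully freed stays
the oldest candidate, so a real design would presumably need to rotate
or skip such pages.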

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com