linux-kernel - Re: [00/17] Large Blocksize Support V3

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1177747448.28223.26.camel@twins>
Date:	Sat, 28 Apr 2007 10:04:08 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Nick Piggin <nickpiggin@...oo.com.au>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	David Chinner <dgc@....com>,
	Christoph Lameter <clameter@....com>,
	linux-kernel@...r.kernel.org, Mel Gorman <mel@...net.ie>,
	William Lee Irwin III <wli@...omorphy.com>,
	Jens Axboe <jens.axboe@...cle.com>,
	Badari Pulavarty <pbadari@...il.com>,
	Maxim Levitsky <maximlevitsky@...il.com>
Subject: Re: [00/17] Large Blocksize Support V3

On Sat, 2007-04-28 at 11:43 +1000, Nick Piggin wrote:
> Andrew Morton wrote:

> > For example, see __do_page_cache_readahead().  It does a read_lock() and a
> > page allocation and a radix-tree lookup for each page.  We can vastly
> > improve that.
> > 
> > Step 1:
> > 
> > - do a read-lock
> > 
> > - do a radix-tree walk to work out how many pages are missing
> > 
> > - read-unlock
> > 
> > - allocate that many pages
> > 
> > - read_lock()
> > 
> > - populate all the pages.
> > 
> > - read_unlock
> > 
> > - if any pages are left over, free them
> > 
> > - if we ended up not having enough pages, redo the whole thing.
> > 
> > that will reduce the number of read_lock()s, read_unlock()s and radix-tree
> > descents by a factor of 32 or so in this testcase.  That's a lot, and it's
> > something we (Nick ;)) should have done ages ago.
> 
> We can do pretty well with the lockless radix tree (that is already upstream)
> there. I split that stuff out of my most recent lockless pagecache patchset,
> because it doesn't require the "scary" speculative refcount stuff of the
> lockless pagecache proper. Subject: [patch 5/9] mm: lockless probe.
> 
> So that is something we could merge pretty soon.
> 
> The other thing is that we can batch up pagecache page insertions for bulk
> writes as well (that is. write(2) with buffer size > page size). I should
> have a patch somewhere for that as well if anyone interested.

Together with the optimistic locking from my concurrent pagecache that
should bring most of the gains:

sequential insert of 8388608 items:

CONFIG_RADIX_TREE_CONCURRENT=n

[ffff81007d7f60c0] insert 0 done in 15286 ms

CONFIG_RADIX_TREE_OPTIMISTIC=y

[ffff81006b36e040] insert 0 done in 3443 ms

only 4.4 times faster, and more scalable, since we don't bounce the
upper level locks around.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/