linux-kernel - Re: [rfc][patch] store-free path walking

Message-ID: <20091007164622.GX30316@wotan.suse.de>
Date:	Wed, 7 Oct 2009 18:46:22 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Jens Axboe <jens.axboe@...cle.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel@...r.kernel.org,
	Ravikiran G Thirumalai <kiran@...lex86.org>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [rfc][patch] store-free path walking

On Wed, Oct 07, 2009 at 09:27:59AM -0700, Linus Torvalds wrote:
> 
> 
> On Wed, 7 Oct 2009, Linus Torvalds wrote:
> > 
> > Hmm. Regardless, this very much does look like what I envisioned, apart 
> > from details like that. And maybe your per-dentry seqlock is the right 
> > choice. On x86, it certainly doesn't have the performance issues it could 
> > have in other places.
> 
> Actually, if we really want to do the per-dentry thing, then we should 
> change it a bit. Maybe rather than using a seqlock data structure (which 
> is really just an unsigned counter and a spinlock), we could do just the 
> unsigned counter, and use the d_lock as the spinlock for the sequence 
> lock.
> 
> The hackiest way to do that would be to get rid of d_lock entirely, 
> replace it with d_seqlock, and then just do
> 
> 	#define d_lock d_seqlock.lock
> 
> instead (but the dentry structure may well have layout issues that make 
> that not work very well - we're mixing pointers and 'int'-sized things 
> and need to pack them well etc).
> 
> That would cut down the seqlock memory costs from 8 bytes (or more - just 
> the spinlock itself is currently 8 bytes on ia64, so on ia64 the seqlock 
> is actually 16 bytes, not to mention all the spinlock debugging cases) to 
> just four bytes.

Oh I did that, used a "seqcount" which is the bare sequence counter
(and update it while holding d_lock).
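
A minimal userspace sketch of that arrangement, assuming C11 atomics in
place of the kernel's seqcount helpers and a toy spinlock standing in for
d_lock. The struct, its fields other than d_lock, and all toy_* names are
hypothetical and are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry; this is not the kernel's layout. */
struct toy_dentry {
	atomic_flag d_lock;	/* plays the role of the d_lock spinlock */
	atomic_uint d_seq;	/* bare sequence counter, no lock of its own */
	atomic_uint parent;	/* fields a rename would change */
	atomic_uint name_len;
};

static void toy_lock(struct toy_dentry *d)
{
	while (atomic_flag_test_and_set_explicit(&d->d_lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void toy_unlock(struct toy_dentry *d)
{
	atomic_flag_clear_explicit(&d->d_lock, memory_order_release);
}

void toy_init(struct toy_dentry *d, unsigned parent, unsigned name_len)
{
	atomic_flag_clear(&d->d_lock);
	atomic_init(&d->d_seq, 0);
	atomic_init(&d->parent, parent);
	atomic_init(&d->name_len, name_len);
}

/* Writer: the counter is only ever bumped while d_lock is held. */
void toy_rename(struct toy_dentry *d, unsigned parent, unsigned name_len)
{
	unsigned seq;

	toy_lock(d);
	seq = atomic_load_explicit(&d->d_seq, memory_order_relaxed);
	/* Odd value: an update is in progress. */
	atomic_store_explicit(&d->d_seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->parent, parent, memory_order_relaxed);
	atomic_store_explicit(&d->name_len, name_len, memory_order_relaxed);
	/* Even value again: the update is complete. */
	atomic_store_explicit(&d->d_seq, seq + 2, memory_order_release);
	toy_unlock(d);
}

/* Reader side: no stores at all, just two loads of the counter. */
unsigned toy_read_begin(struct toy_dentry *d)
{
	return atomic_load_explicit(&d->d_seq, memory_order_acquire);
}

bool toy_read_retry(struct toy_dentry *d, unsigned seq)
{
	atomic_thread_fence(memory_order_acquire);
	return (seq & 1) ||
	       atomic_load_explicit(&d->d_seq, memory_order_relaxed) != seq;
}

/* Take a consistent snapshot, retrying if a writer interfered. */
void toy_snapshot(struct toy_dentry *d, unsigned *parent, unsigned *name_len)
{
	unsigned seq;

	do {
		seq = toy_read_begin(d);
		*parent = atomic_load_explicit(&d->parent,
					       memory_order_relaxed);
		*name_len = atomic_load_explicit(&d->name_len,
						 memory_order_relaxed);
	} while (toy_read_retry(d, seq));
}
```

The point of the sketch is the memory cost: the only per-object addition
is the one unsigned counter, since writer exclusion comes from the lock
that is already there.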

Yes it still has packing issues, although I think I can get rid of
d_mounted so it will then pack nicely and size won't change. (just
have a flag if we are mounted at least once, and just store the
count elsewhere for mountpoints -- or even just search the mount
hash on each umount to see if anything is left mounted on it)

 
> However, I still suspect we could do things entirely without the seqlock. 
> The outer seqlock will handle the "couldn't find it" case, and I've got 
> the strongest feeling that we should be able to just use some basic memory 
> ordering on the dentry hash to make the inner seqlock unnecessary (ie 
> make sure that either we don't see the old entry at all, or that we can 
> guarantee that it won't trigger a successful compare while the rename is 
> in process because we set the dentry name length to zero).

Well, I would be all for improving things of course. But keep in
mind we already do the rename_lock seqcount for each d_lookup,
so the lock-free lookup path is only doing extra seqlocks on dcache
hash collision cases.
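
For reference, the existing rename_lock check in d_lookup has roughly the
following shape. This is a single-threaded userspace sketch of the control
flow only: the hash table is a plain linked list with no RCU protection,
and every toy_* name is hypothetical. The counter stands in for the
sequence part of rename_lock.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUCKETS 4

struct toy_entry {
	const char *name;
	struct toy_entry *next;
};

static atomic_uint toy_rename_seq;	/* stand-in for rename_lock's counter */
static struct toy_entry *toy_hash[TOY_BUCKETS];
unsigned toy_rename_rechecks;		/* how often a miss forced a re-check */

static unsigned toy_bucket(const char *name)
{
	unsigned h = 0;

	while (*name)
		h = h * 31 + (unsigned char)*name++;
	return h % TOY_BUCKETS;
}

void toy_insert(struct toy_entry *e)
{
	unsigned b = toy_bucket(e->name);

	e->next = toy_hash[b];
	toy_hash[b] = e;
}

/* Walk one hash chain; a concurrent rename could make this miss. */
static struct toy_entry *toy_walk(const char *name)
{
	struct toy_entry *e;

	for (e = toy_hash[toy_bucket(name)]; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return e;
	return NULL;
}

struct toy_entry *toy_lookup(const char *name)
{
	struct toy_entry *e;
	unsigned seq;

	for (;;) {
		/* Sampled on every lookup, hit or miss. */
		seq = atomic_load_explicit(&toy_rename_seq,
					   memory_order_acquire);
		e = toy_walk(name);
		if (e)
			return e;	/* a hit needs no re-check */

		/* A miss is only trusted if no rename ran meanwhile. */
		toy_rename_rechecks++;
		atomic_thread_fence(memory_order_acquire);
		if (!(seq & 1) &&
		    atomic_load_explicit(&toy_rename_seq,
					 memory_order_relaxed) == seq)
			return NULL;
	}
}
```

Here the outer counter covers the "couldn't find it" case: a hit returns
immediately, and only a miss pays for the second read of the counter.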

But I do agree it needs more thought. I'll try to get the powerpc
guys interested in running tests for us tomorrow :)
