linux-kernel - Re: [rfc][patch] store-free path walking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20091007222235.GC1656@one.firstfloor.org>
Date:	Thu, 8 Oct 2009 00:22:35 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Andi Kleen <andi@...stfloor.org>, Nick Piggin <npiggin@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel@...r.kernel.org,
	Ravikiran G Thirumalai <kiran@...lex86.org>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [rfc][patch] store-free path walking

On Wed, Oct 07, 2009 at 02:57:20PM -0700, Linus Torvalds wrote:
> There's no question that prefetching cannot help, but it helps only if 
> it's about fetching data that you would need anyway early. In contrast, if 
> the option is "don't touch the other cacheline at all", prefetching is 
> _always_ a loss. No ifs, buts and maybes about it.

My point (probably not very well written expressed)
was that in a typical VFS operation there are hundreds
of cache lines touched for various things (code, global, various
objects, random stuff) and one more or less in the final dentry 
is not that big a difference in the global picture.
(ok I realize final in this case means the elements in the path)

Also typical operations don't do the same VFS operation in
a loop, but other things that cools the caches first
and then have to fetch everything in again.

I agree that touching more cache lines on the hash chain
walk for immediates would be dumb because there can
be potentially a lot of them, but the final referenced
ones are much fewer.

Or rather if minimizing total foot-print is the goal
there are lower hanging fruit than in the dentry itself
by just cutting fat from the whole path.

(e.g. I liked Mathieu's immediate value work recently
reposted because it had the nice potential to remove a lot of
"global" cache lines in such paths by pushing them
into the icache). 

And if it's possible to do less dcache locks or less 
looping in a seqlock by paying with one cache 
line that could be a reasonable trade off, assuming you
can hide the latency.

Or maybe not and I'm totally wrong on that.
I'll shut up on this now.

-Andi
-- 
ak@...ux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/