Date:	Thu, 8 Oct 2009 00:22:35 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Andi Kleen <andi@...stfloor.org>, Nick Piggin <npiggin@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel@...r.kernel.org,
	Ravikiran G Thirumalai <kiran@...lex86.org>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [rfc][patch] store-free path walking

On Wed, Oct 07, 2009 at 02:57:20PM -0700, Linus Torvalds wrote:
> There's no question that prefetching cannot help, but it helps only if 
> it's about fetching data that you would need anyway early. In contrast, if 
> the option is "don't touch the other cacheline at all", prefetching is 
> _always_ a loss. No ifs, buts and maybes about it.

My point (probably not very well expressed)
was that in a typical VFS operation there are hundreds
of cache lines touched for various things (code, globals, various
objects, random stuff), and one more or less in the final dentry
is not that big a difference in the global picture.
(OK, I realize "final" in this case means the elements in the path.)

Also, typical workloads don't do the same VFS operation in
a loop; they do other things that cool the caches first
and then have to fetch everything in again.

I agree that touching more cache lines on the hash chain
walk for intermediates would be dumb because there can
potentially be a lot of them, but the final referenced
ones are much fewer.
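
As a rough illustration of that distinction, here is a toy model in C. It is
not kernel code: struct entry, struct touch_count and lookup() are
hypothetical names, and the "cache lines" are simply counted by hand under
the assumption that the fields read during the walk share one line while the
reference count sits in a second one.

```c
#include <stddef.h>

/* Hypothetical hash chain entry. Assumption: next and hash share one
 * cache line (the "hot" line), refcount lives in a second ("cold") line. */
struct entry {
    struct entry *next;
    unsigned hash;
    int refcount;
};

/* Hand-maintained counters standing in for cache line touches. */
struct touch_count {
    unsigned hot;   /* lines read while walking the chain */
    unsigned cold;  /* extra lines touched for the entry that matched */
};

/* Walk the chain, comparing only the hot fields. The cold line is
 * touched once, for the final entry that is actually referenced. */
static struct entry *lookup(struct entry *head, unsigned hash,
                            struct touch_count *tc)
{
    for (struct entry *e = head; e != NULL; e = e->next) {
        tc->hot++;              /* every visited entry costs one line */
        if (e->hash == hash) {
            tc->cold++;         /* only the match costs a second line */
            e->refcount++;
            return e;
        }
    }
    return NULL;
}
```

In this model a lookup that visits N entries touches N hot lines but only one
cold line, which is the sense in which the final referenced entries are much
fewer than the intermediates.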

Or rather, if minimizing total footprint is the goal,
there is lower-hanging fruit than the dentry itself:
just cutting fat from the whole path.

(E.g. I liked Mathieu's immediate value work, recently
reposted, because it had the nice potential to remove a lot of
"global" cache lines in such paths by pushing them
into the icache.)

And if it's possible to take fewer dcache locks or do less
looping in a seqlock by paying with one cache
line, that could be a reasonable trade-off, assuming you
can hide the latency.
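
For readers less familiar with the looping in question, below is a minimal
user-space sketch of a sequence-counter read loop using C11 atomics. It is
not the kernel's seqlock API: struct seq_pair, pair_init(), pair_write() and
pair_read() are hypothetical names, a single writer is assumed, and every
access uses the default sequentially consistent ordering for simplicity,
where a real implementation would use weaker barriers.

```c
#include <stdatomic.h>

/* Hypothetical data protected by a sequence counter.
 * Even counter: data is stable. Odd counter: a write is in progress. */
struct seq_pair {
    atomic_uint seq;
    atomic_int a;
    atomic_int b;
};

static void pair_init(struct seq_pair *p)
{
    atomic_init(&p->seq, 0);
    atomic_init(&p->a, 0);
    atomic_init(&p->b, 0);
}

/* Writer side. Assumption: only one writer at a time. */
static void pair_write(struct seq_pair *p, int a, int b)
{
    atomic_fetch_add(&p->seq, 1);   /* counter becomes odd */
    atomic_store(&p->a, a);
    atomic_store(&p->b, b);
    atomic_fetch_add(&p->seq, 1);   /* counter becomes even again */
}

/* Reader side: performs no stores to shared memory. Returns how many
 * times the read had to be retried before a consistent snapshot. */
static unsigned pair_read(struct seq_pair *p, int *a, int *b)
{
    unsigned retries = 0;

    for (;;) {
        unsigned start = atomic_load(&p->seq);

        if (!(start & 1u)) {
            int ra = atomic_load(&p->a);
            int rb = atomic_load(&p->b);

            /* Unchanged counter means no writer interfered. */
            if (atomic_load(&p->seq) == start) {
                *a = ra;
                *b = rb;
                return retries;
            }
        }
        retries++;
    }
}
```

The reader never writes shared state, but it does have to read the counter,
and it loops whenever a writer gets in between its two counter reads.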

Or maybe not and I'm totally wrong on that.
I'll shut up on this now.

-Andi
-- 
ak@...ux.intel.com -- Speaking for myself only.