Date:	Wed, 7 Mar 2007 19:20:12 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	"Michael K. Edwards" <medwards.linux@...il.com>
cc:	Anton Blanchard <anton@...ba.org>,
	Eric Dumazet <dada1@...mosbay.com>,
	Davide Libenzi <davidel@...ilserver.org>,
	Avi Kivity <avi@...o.co.il>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [patch] epoll use a single inode ...



On Wed, 7 Mar 2007, Michael K. Edwards wrote:
> 
> People's prejudices against prefetch instructions are sometimes
> traceable to the 3DNow! prefetch(w) botch, which some processors
> "support" as no-ops and others are too aggressive about (Opteron
> prefetches are reputed to be "strong", i. e., not dropped on DTLB
> miss).

No, I just checked, and Intel's own optimization manual makes it clear 
that you should be careful. They talk about performance penalties due to 
resource constraints - which makes tons of sense with a core that is good 
at handling its own resources and could quite possibly use those resources 
better to actually execute the loads and stores deeper down the 
instruction pipeline.

So it's not just 3DNow! making AMD look bad, or Intel would obviously 
suggest people use it out of the wazoo ;)

> XScale gets it right.

Blah. XScale isn't even an OoO CPU, *of*course* it needs prefetching. 
Calling that "getting it right" is ludicrous. If anything, it gets things 
so wrong that prefetching is *required* for good performance.

I'm talking about real CPUs with real memory pipelines that already do 
prefetching in hardware. The better the core is, the less the prefetch 
helps (and often the more it hurts in comparison to how much it helps).

But if you mean "doesn't try to fill the TLB on data prefetches", then 
yes, that's generally the right thing to do.

> (Oddly, Prescott seems to have initiated a page table walk on DTLB miss 
> during software prefetch -- just one of many weird Prescott flaws.)  

Netburst in general is *very* happy to do speculative TLB fills, I think.

> I'm guessing Pentium M and its descendants (Core Solo and Duo) get it 
> right but I'm having a hell of a time finding out for sure.  Can any of 
> the x86 experts answer this?

I just suspect that the upside for Core 2 Duo is likely fairly low. The L2 
cache is good, the memory re-ordering is working.. I doubt "prefetch" 
helps in generic code that much for things like linked list following, you 
should probably limit it to code that has *known* access patterns and you 
know it's not going to be in the cache.

(In other words, I bet prefetching can help a lot with MMX/media kind of 
code, I doubt it's a huge win for "for_each_entry()")
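
To make the contrast concrete, here is a rough sketch of the two cases, not 
code from any real tree. It assumes GCC or Clang, since it uses their 
__builtin_prefetch() builtin; struct node, sum_list(), sum_array() and the 
PREFETCH_AHEAD distance are all made up for illustration, and the distance 
is untuned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, for illustration only. */
struct node {
    struct node *next;
    long value;
};

/*
 * Pointer chasing: the address of the node after the next one is not
 * known until n->next itself has been loaded, so the hint can only run
 * one node ahead of the real load.
 */
long sum_list(const struct node *head)
{
    long sum = 0;

    for (const struct node *n = head; n != NULL; n = n->next) {
        /* Hint only; a prefetch of an invalid address does not fault. */
        __builtin_prefetch(n->next);
        sum += n->value;
    }
    return sum;
}

/* Assumed, untuned prefetch distance, counted in array elements. */
#define PREFETCH_AHEAD 64

/*
 * Known access pattern: the address needed many iterations from now can
 * be computed immediately, so the hint can be issued far in advance.
 */
long sum_array(const long *a, size_t len)
{
    long sum = 0;

    assert(a != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (i + PREFETCH_AHEAD < len)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 0); /* read, low temporal locality */
        sum += a[i];
    }
    return sum;
}
```

Both functions return the same sums with or without the hints, so the only 
way to tell whether either one is worth having is to measure it on the CPU 
you actually care about.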

		Linus