Date:	Sat, 01 Mar 2014 08:50:17 -0800
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...nel.org>
CC:	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Vince Weaver <vincent.weaver@...ne.edu>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...hat.com>, Jiri Olsa <jolsa@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: perf_fuzzer compiled for x32 causes reboot

The bottom line is that if we want hard numbers we probably have to measure.

Hoisting the cr2 read is a no-brainer, might even help performance...
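
For anyone who wants to try, a rough sketch of the measurement pattern
follows. It is only an illustration under stated assumptions, not code
from this thread: CR2 cannot be read from user space, so op_under_test()
is a hypothetical stand-in (a volatile load), and every name in it is
invented for the example. In a kernel context one would presumably swap
in the real CR2 read (e.g. read_cr2()) and count cycles rather than
nanoseconds. The idea is simply to time many back-to-back calls and
subtract an empty-loop baseline.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() under strict -std */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the operation being measured. The real CR2
 * read is privileged, so user space can only demonstrate the harness. */
uint64_t op_under_test(void)
{
    static volatile uint64_t fake_reg = 0x1000;
    return fake_reg;              /* volatile load: cannot be optimized away */
}

/* Baseline operation used to estimate loop and call overhead. */
uint64_t empty_op(void)
{
    return 0;
}

/* Current monotonic time in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Total nanoseconds spent on `iters` back-to-back calls of fn. */
uint64_t time_loop_ns(uint64_t (*fn)(void), uint64_t iters)
{
    volatile uint64_t sink = 0;   /* keeps the results "used" */
    uint64_t start = now_ns();
    for (uint64_t i = 0; i < iters; i++)
        sink += fn();
    uint64_t end = now_ns();
    (void)sink;
    return end - start;
}

/* Average cost per call in ns, with the empty-loop baseline subtracted.
 * Clamped at zero because noise can make the baseline run slower. */
double avg_cost_ns(uint64_t (*fn)(void), uint64_t iters)
{
    assert(iters > 0);
    uint64_t with = time_loop_ns(fn, iters);
    uint64_t base = time_loop_ns(empty_op, iters);
    if (with <= base)
        return 0.0;
    return (double)(with - base) / (double)iters;
}
```

Calling avg_cost_ns(op_under_test, 1000000) gives a per-call estimate.
Note that a tight loop like this lets the out-of-order engine overlap
iterations, so the result is closer to throughput than to the latency
seen on a real fault path.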

On March 1, 2014 1:50:42 AM PST, Borislav Petkov <bp@...en8.de> wrote:
>On Sat, Mar 01, 2014 at 10:16:50AM +0100, Ingo Molnar wrote:
>> 
>> * Steven Rostedt <rostedt@...dmis.org> wrote:
>> 
>> > > Also, this function is called a _LOT_ under certain workloads, I 
>> > > don't know how cheap a CR2 read is, but it had better be really 
>> > > cheap.
>> > 
>> > That's a HPA question.
>> 
>> We read CR2 in the page fault hot path, so it's on the top of CPU 
>> architects' minds and it's reasonably optimized. A couple of cycles 
>> IIRC, but would be nice to hear actual numbers.
>
>Yeah, we were discussing this last night on IRC.
>
>And hpa actually meant that the optimization potential was there but no
>one actually did it, except maybe Transmeta. :-)
>
>So the expensive thing is writing to CR2 because it is a serializing
>instruction. In fact, all writes to control registers except CR8 are
>serializing.
>
>Reading from CR2 should be cheaper, but not as cheap as a normal
>MOV %reg, %reg. On AMD, MOV %cr2, %reg is done in microcode, so it is
>definitely at least a couple of cycles, and I'd guess it is not a
>trivial MOV on Intel either.
>
>Maybe a way to hide this cost is OoO execution, as hpa suggested,
>depending on how much parallelism that particular code region can offer
>(e.g. whether there are serializing instructions close by).

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.