Date: Sat, 01 Mar 2014 08:50:17 -0800
From: "H. Peter Anvin" <hpa@...or.com>
To: Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...nel.org>
CC: Steven Rostedt <rostedt@...dmis.org>,
    Peter Zijlstra <peterz@...radead.org>,
    Vince Weaver <vincent.weaver@...ne.edu>,
    Linux Kernel <linux-kernel@...r.kernel.org>,
    Ingo Molnar <mingo@...hat.com>, Jiri Olsa <jolsa@...hat.com>,
    Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: perf_fuzzer compiled for x32 causes reboot

The bottom line is that if we want hard numbers we probably have to
measure. Hoisting the cr2 read is a no-brainer, might even help
performance...

On March 1, 2014 1:50:42 AM PST, Borislav Petkov <bp@...en8.de> wrote:
>On Sat, Mar 01, 2014 at 10:16:50AM +0100, Ingo Molnar wrote:
>>
>> * Steven Rostedt <rostedt@...dmis.org> wrote:
>>
>> > > Also, this function is called a _LOT_ under certain workloads, I
>> > > don't know how cheap a CR2 read is, but it had better be really
>> > > cheap.
>> >
>> > That's a HPA question.
>>
>> We read CR2 in the page fault hot path, so it's on the top of CPU
>> architects' minds and it's reasonably optimized. A couple of cycles
>> IIRC, but would be nice to hear actual numbers.
>
>Yeah, we were discussing this last night on IRC.
>
>And hpa actually meant that the optimization potential was there but no
>one did do it, except maybe Transmeta. :-)
>
>So the expensive thing is writing to CR2 because it is a serializing
>instruction. In fact, all writes to control registers except CR8 are
>serializing.
>
>The reading from CR2 should be cheaper but not as cheap as a normal
>MOV %reg %reg is. On AMD, MOV %reg, %cr2 is done with microcode so
>definitely at least a couple of cycles and I'd guess it is not a
>trivial MOV on Intel too.
>
>Maybe a way to hide this cost is the OoO, as hpa suggested, depending
>on how much parallelism that particular code region can offer
>(serializing instructions close by).

-- 
Sent from my mobile phone.
Please pardon brevity and lack of formatting.