Date:   Mon, 25 May 2020 12:02:48 +0200
From:   Rasmus Villemoes <linux@...musvillemoes.dk>
To:     Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>
Subject: Re: [RFC][PATCH 0/4] x86/entry: disallow #DB more

On 23/05/2020 23.32, Peter Zijlstra wrote:
> On Sat, May 23, 2020 at 02:59:40PM +0200, Peter Zijlstra wrote:
>> On Fri, May 22, 2020 at 03:13:57PM -0700, Andy Lutomirski wrote:
> 
>> Good point, so the trivial optimization is below. I couldn't find
>> instruction latency numbers for DRn load/stores anywhere. I'm hoping
>> loads are cheap.
> 
> +	u64 empty = 0, read = 0, write = 0;
> +	unsigned long dr7;
> +
> +	for (i=0; i<100; i++) {
> +		u64 s;
> +
> +		s = rdtsc();
> +		barrier_nospec();
> +		barrier_nospec();
> +		empty += rdtsc() - s;
> +
> +		s = rdtsc();
> +		barrier_nospec();
> +		dr7 = native_get_debugreg(7);
> +		barrier_nospec();
> +		read += rdtsc() - s;
> +
> +		s = rdtsc();
> +		barrier_nospec();
> +		native_set_debugreg(7, 0);
> +		barrier_nospec();
> +		write += rdtsc() - s;
> +	}
> +
> +	printk("XXX: %ld %ld %ld\n", empty, read, write);
> 
> 
> [    1.628125] XXX: 2800 2404 19600
> 
> IOW, reading DR7 is basically free, and certainly cheaper than looking
> at cpu_dr7 which would probably be an insta cache miss.
> 

Naive question: did you check the disassembly to see whether gcc threw
your native_get_debugreg() away, given that the asm isn't volatile and
the result is not used for anything? Testing here, I only see a "mov
%r9,%db7" (the write); the read does seem to have been thrown away.
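
To illustrate the concern, here is a minimal userspace sketch. It is not
the kernel code: reg_read_plain(), reg_read_volatile(), bench_discarded(),
bench_consumed() and check_reads() are hypothetical names, and an empty
asm template stands in for the privileged DR7 read so that it builds
outside the kernel with gcc or clang. The point is only that an asm
without volatile may be deleted when its output is unused, while a
volatile asm, or a result that is actually consumed, keeps the read in
the generated code.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a register read. Without volatile, the
 * compiler treats the asm as a pure function of its inputs and may
 * discard it when 'val' is never used.
 */
static inline unsigned long reg_read_plain(unsigned long src)
{
	unsigned long val;

	/* "0" ties the input to the output register, so val == src */
	__asm__("" : "=r" (val) : "0" (src));
	return val;
}

/* Same stand-in marked volatile: emitted even if the result is ignored. */
static inline unsigned long reg_read_volatile(unsigned long src)
{
	unsigned long val;

	__asm__ __volatile__("" : "=r" (val) : "0" (src));
	return val;
}

/*
 * Result ignored: with optimization enabled the compiler is free to
 * drop the asm entirely, so a timing loop built like this may measure
 * nothing. Returns the number of iterations performed.
 */
unsigned long bench_discarded(unsigned int iters)
{
	unsigned long n = 0;
	unsigned int i;

	for (i = 0; i < iters; i++) {
		(void)reg_read_plain(7UL);
		n++;
	}
	return n;
}

/*
 * Result consumed: accumulating each value and returning the sum keeps
 * every read live, even without volatile.
 */
unsigned long bench_consumed(unsigned int iters)
{
	unsigned long acc = 0;
	unsigned int i;

	for (i = 0; i < iters; i++)
		acc += reg_read_plain(i);
	return acc;
}

/* Usage: both variants hand back their input when the result is used. */
void check_reads(void)
{
	assert(reg_read_plain(5UL) == 5UL);
	assert(reg_read_volatile(9UL) == 9UL);
}
```

Building this with -O2 and comparing the disassembly of bench_discarded()
and bench_consumed() should show whether the compiler kept the asm; the
outcome depends on the compiler and options used.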

Rasmus