lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 28 Mar 2019 09:22:22 -0700
From:   "Paul E. McKenney" <paulmck@...ux.ibm.com>
To:     Alexander Potapenko <glider@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Dmitriy Vyukov <dvyukov@...gle.com>,
        James Y Knight <jyknight@...gle.com>
Subject: Re: Potentially missing "memory" clobbers in bitops.h for x86

On Thu, Mar 28, 2019 at 03:14:12PM +0100, Alexander Potapenko wrote:
> Hello,
> 
> arch/x86/include/asm/bitops.h defines clear_bit(nr, addr) for
> non-constant |nr| values as follows:
> 
> void clear_bit(long nr, volatile unsigned long *addr) {
>   asm volatile("lock; btr %1,%0"
>     : "+m"(*(volatile long *)addr)
>     : "Ir" (nr));
>   }
> (https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/bitops.h#L111)
> 
> According to the comments in the file, |nr| may be arbitrarily large.
> However the assembly constraints only imply that the first unsigned
> long value at |addr| is written to.
> This may result in the compiler ignoring the effect of the asm directive.
> 
> Consider the following example (https://godbolt.org/z/naTmjn):
> 
>   #include <stdio.h>
>   void clear_bit(long nr, volatile unsigned long *addr) {
>     asm volatile("lock; btr %1,%0"
>       : "+m"(*(volatile long *)addr)
>       : "Ir" (nr));
>   }
> 
>   unsigned long foo() {
>     unsigned long addr[2] = {1, 2};
>     clear_bit(65, addr);
>     return addr[0] + addr[1];
>   }
> 
>   int main() {
>     printf("foo: %lu\n", foo());
>   }
> 
> Depending on the optimization level, the program may print either 1
> (for -O0 and -O1) or 3 (for -O2 and -O3).
> This is because on higher optimization levels GCC assumes that addr[1]
> is unchanged and directly propagates the constant to the result.
> 
> I suspect the definitions of clear_bit() and similar functions are
> lacking the "memory" clobber.
> But the whole file tends to be very picky about whether this clobber
> needs to be applied in each case, so in the case of a performance
> penalty we may need to consider alternative approaches to fixing this
> code.

Is there a way of indicating a clobber only on the specific location
affected?  I suppose that one approach would be to calculate in C code
a pointer to the specific element of the addr[] array, which would
put the specific clobbered memory location onto the outputs list.
Or keep the current calculation, but also add "addr[nr / sizeof(long)]"
to the output list, thus telling the compiler exactly what is being
clobbered.  Assuming that actually works...

Of course, this would force the compiler to actually compute the
offset, which would slow things down.  I have no idea whether this
would be better or worse than just using the "memory" clobber.

							Thanx, Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ