lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LRH.2.00.1202011808240.22725@twin.jikos.cz>
Date:	Wed, 1 Feb 2012 18:11:29 +0100 (CET)
From:	Jiri Kosina <jkosina@...e.cz>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Colin Walters <walters@...bum.org>, Jan Kara <jack@...e.cz>,
	LKML <linux-kernel@...r.kernel.org>, linux-ia64@...r.kernel.org,
	dsterba@...e.cz, ptesarik@...e.cz, rguenther@...e.de,
	gcc@....gnu.org
Subject: Re: Memory corruption due to word sharing

On Wed, 1 Feb 2012, Linus Torvalds wrote:

> But the compiler turns the access to the bitfield (in a 32-bit aligned
> word) into a 64-bit access that accesses the word *next* to it.
> 
> That word next to it might *be* the lock, for example.
> 
> So we could literally have this kind of situation:
> 
>    struct {
>       atomic_t counter;
>       unsigned int val:4, other:4, data:24;
>    };
> 
> and if we write code like this:
> 
>     spin_lock(&somelock);
>     s->data++;
>     spin_unlock(&somelock);
> 
> and on another CPU we might do
> 
>    atomic_inc(&counter);
> 
> and the access to the bitfield will *corrupt* the atomic counter, even
> though both of them are perfectly fine!
> 
> Quite frankly, if the bug is simply because gcc doesn't actually know
> or care about the underlying size of the bitmask, it is possible that
> we can find a case where gcc clearly is buggy even according to the
> original C rules.
> 
> Honza - since you have access to the compiler in question, try
> compiling this trivial test-program:
> 
> 
>    struct example {
>       volatile int a;
>       int b:1;
>    };
> 
>    ..
>      s->b = 1;
>    ..
> 
> and if that bitfield access actually does a 64-bit access that also
> touches 's->a', then dammit, that's a clear violation of even the
> *old* C standard, and the gcc people cannot just wave away their bugs
> by saying "we've got standads, pttthththt".
> 
> And I suspect it really is a generic bug that can be shown even with
> the above trivial example.

I have actually tried exactly this earlier today (because while looking at 
this, I had an idea that putting volatile in place could be a workaround, 
causing gcc to generate a saner code), but it doesn't work either:

# cat x.c
struct x {
    long a;
    volatile unsigned int lock;
    unsigned int full:1;
};

void
wrong(struct x *ptr)
{
        ptr->full = 1;
}

int main()
{
        wrong(0);
}
# gcc -O2 x.c
# gdb -q ./a.out 
Reading symbols from /root/a.out...done.
(gdb) disassemble wrong
Dump of assembler code for function wrong:
   0x40000000000005c0 <+0>:     [MMI]       adds r32=8,r32
   0x40000000000005c1 <+1>:                 nop.m 0x0
   0x40000000000005c2 <+2>:                 mov r15=1;;
   0x40000000000005d0 <+16>:    [MMI]       ld8 r14=[r32];;
   0x40000000000005d1 <+17>:                nop.m 0x0
   0x40000000000005d2 <+18>:                dep r14=r15,r14,32,1;;
   0x40000000000005e0 <+32>:    [MIB]       st8 [r32]=r14
   0x40000000000005e1 <+33>:                nop.i 0x0
   0x40000000000005e2 <+34>:                br.ret.sptk.many b0;;

In my opinion, this is a clear bug in gcc (while the original problem, 
without explitict volatile, is not a C spec violation per se, it's just 
very inconvenient :) ).

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ