lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140725035527.GA30108@pg-vmw-gw1>
Date:	Thu, 24 Jul 2014 20:55:28 -0700
From:	Alexei Starovoitov <ast@...nel.org>
To:	Michel Dänzer <michel@...nzer.net>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jakub Jelinek <jakub@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Debian GCC Maintainers <debian-gcc@...ts.debian.org>,
	Debian Kernel Team <debian-kernel@...ts.debian.org>
Subject: Re: Random panic in load_balance() with 3.16-rc

On Fri, Jul 25, 2014 at 10:25:03AM +0900, Michel Dänzer wrote:
> [ Adding the Debian kernel and gcc teams to Cc ]
> 
> >         movq    $load_balance_mask, -136(%rbp)  #, %sfp
> >         subq    $184, %rsp      #,
> > 
> > Anyway, this is not a kernel bug. This is your compiler creating
> > completely broken code. We may need to add a warning to make sure
> > nobody compiles with gcc-4.9.0, and the Debian people should probably
> > downgrate their shiny new compiler.
> 
> Attached is fair.s from Debian gcc 4.8.3-5. Does that look better? I'm
> going to try reproducing the problem with a kernel built by that now.

4.8 and 4.7 don't hit the problem on this test.
4.9 with -O2 compiles this file ok. 4.9 with -Os triggers it.

-mno-red-zone only affected prologue emition in gcc. This part didn't
change between the releases. So the bug is quite deep.
What seems to be happening is that 2nd pass of instruction scheduler
(after emit prologue and reg alloc) is ignoring barrier properties
of 'subq $184, %rsp' and moving 'movq $.., -136(%rbp)' instruction
ahead of it. afaik rtl sched was never aware of 'red-zone'.
As an ex-compiler guy, I'm worried that this bug exists in earlier
releases. rtl backend guys need to take a serious look at it.
imo this is very serious bug, since broken red-zone is extremelly
hard to debug.
There are two weak test in gcc testsuite related to -mno-red-zone,
but not a single test that actually check that it is doing
the right thing. It is scary. I hope I'm wrong with this analysis.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ