Message-ID: <20170302163813.GE27998@distanz.ch>
Date:   Thu, 2 Mar 2017 17:38:13 +0100
From:   Tobias Klauser <tklauser@...tanz.ch>
To:     Guenter Roeck <linux@...ck-us.net>
Cc:     Sven Schmidt <4sschmid@...ormatik.uni-hamburg.de>,
        Sandra Loosemore <sandra@...esourcery.com>,
        Arnd Bergmann <arnd@...db.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, Ley Foon Tan <lftan@...era.com>,
        nios2-dev@...ts.rocketboards.org
Subject: Re: nios2 crash/hang in mainline due to 'lib: update LZ4 compressor
 module'

On 2017-03-01 at 20:45:21 +0100, Guenter Roeck <linux@...ck-us.net> wrote:
> On Wed, Mar 01, 2017 at 07:58:17PM +0100, Sven Schmidt wrote:
> > Hi Guenter, Tobias and Sandra,
> > 
> > thanks for your effort here.
> > 
> > On Tue, Feb 28, 2017 at 10:14:13AM -0800, Guenter Roeck wrote:
> > > On Tue, Feb 28, 2017 at 10:53:56AM -0700, Sandra Loosemore wrote:
> > > > On 02/28/2017 08:53 AM, Tobias Klauser wrote:
> > > > >(adding Sandra Loosemore to Cc due to possible relation to gcc/binutils
> > > > >for nios2)
> > > > >
> > > > >On 2017-02-26 at 22:03:38 +0100, Guenter Roeck <linux@...ck-us.net> wrote:
> > > > >>Hi Sven,
> > > > >>
> > > > >>my qemu test for nios2 started failing with commit 4e1a33b105dd ("lib:
> > > > >>update LZ4 compressor module"). The test hangs early during boot before
> > > > >>any console output is seen. Reverting the offending patch as well as the
> > > > >>subsequent lz4 related patches fixes the problem. Disabling CONFIG_RD_LZ4
> > > > >>and with it other LZ4 options also fixes it (as does adding "return -EINVAL;"
> > > > >>at the top of the LZ4 decompression code). For reference, bisect log
> > > > >>is attached.
> > > > >>
> > > > >>I tried with buildroot toolchains using gcc 6.1.0 as well as 6.3.0
> > > > >>and binutils 2.26.1. Scripts used to run the tests are available at
> > > > >>https://github.com/groeck/linux-build-test/tree/master/rootfs/nios2.
> > > > >>Qemu is from qemu mainline or qemu v2.8 with nios2 patches applied.
> > > > >
> > > > >Looks like this is somehow related to gcc/binutils. Using GCC 4.8.3 and
> > > > > >binutils 2.24.51 (both from Sourcery CodeBench Lite 2014.05) I can
> > > > >get a kernel booting on latest master branch. AFAICT, none of the
> > > > >LZ4_decompress_* functions are called during boot.
> > > > >
> > 
> > It seems a bit strange that code which is not actually called causes problems like that.
> > 
> Yes, it is, though it is always possible. The code isn't exactly easy to
> understand; there may be some hidden caveats such as global variables. It may
> also be that some jump target exceeds its range (though why that would only
> be seen with the LZ4 code is another question), or that the compiler gets
> confused by the forced inlines (disabling that didn't make a difference,
> though, nor did disabling -O3).
> 
> > Please let me know if and how I may help you figure out what's happening, especially
> > regarding the differences between the previous LZ4 and the current implementation.
> > 
> 
> For my part I am all but clueless. Unless someone has an idea, we may have
> to disable LZ4 support for nios2 for the time being. Does anyone have thoughts
> on that? Of course, that would not help if the problem also affects
> recent gcc/binutils versions on other architectures.

After some further investigation, I'd say this isn't "caused" by LZ4
specifically but by a more general problem with one of the nios2
arch-specific tools involved.

I manually enabled random additional CONFIG_* options and in some cases
I got the kernel to boot (with CONFIG_RD_LZ4 enabled and no return
-EINVAL in place) while in others I didn't. So I'd rather suspect this
problem to be connected to the size or structure of the generated vmlinux
image.
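
The size/structure hypothesis could be narrowed down by comparing the
per-section sizes of a booting and a hanging vmlinux. A minimal sketch,
assuming the input text comes from GNU `size -A` run on each image (the
helper names parse_size_output and diff_sections are hypothetical and not
from this thread):

```python
def parse_size_output(text):
    """Map section name -> size from `size -A`-style (SysV format) text."""
    sections = {}
    for line in text.splitlines():
        fields = line.split()
        # Section rows have three columns: name, size, address.
        # The header row and the "Total" row do not match and are skipped.
        if len(fields) == 3 and fields[1].isdigit() and fields[2].isdigit():
            sections[fields[0]] = int(fields[1])
    return sections


def diff_sections(good, bad):
    """Return {section: (good_size, bad_size)} for sections that differ."""
    names = sorted(set(good) | set(bad))
    return {
        name: (good.get(name, 0), bad.get(name, 0))
        for name in names
        if good.get(name, 0) != bad.get(name, 0)
    }
```

Feeding it the output for one booting and one hanging configuration shows
which sections grew or moved, which may hint at a range or alignment limit.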

Or could this even be a problem with qemu? Did anyone already verify
this on the 10m50 devboard? (Unfortunately I don't have any nios2
devboard available right now, otherwise I would have done this...)

Other than that I'm also becoming all but clueless... One option I
thought of was using the QEMU monitor to dump the CPU state after the
hang but so far I didn't manage to get it to work (hints appreciated ;)
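
One possible way to get at the CPU state, sketched under assumptions: QEMU
is started with a QMP socket (for example `-qmp unix:PATH,server,nowait`,
where PATH is a placeholder), and the register dump is requested through
the QMP `human-monitor-command` wrapper around `info registers`. The
function names below are hypothetical, and this has not been tried against
the nios2 target discussed here.

```python
import json
import socket


def qmp_command(execute, arguments=None):
    """Encode one QMP command as a newline-terminated JSON message."""
    msg = {"execute": execute}
    if arguments is not None:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\r\n").encode()


def read_reply(lines):
    """Return the first command reply, skipping asynchronous events."""
    for line in lines:
        msg = json.loads(line)
        if "return" in msg or "error" in msg:
            return msg
    raise ConnectionError("QMP connection closed before a reply arrived")


def dump_cpu_state(sock_path):
    """Fetch `info registers` output from a (possibly hung) guest.

    sock_path is the UNIX socket path given to -qmp (an assumption).
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        stream = sock.makefile("rwb")
        json.loads(stream.readline())  # greeting banner sent by QEMU
        # Capabilities negotiation is required before any other command.
        stream.write(qmp_command("qmp_capabilities"))
        stream.flush()
        read_reply(stream)
        stream.write(qmp_command("human-monitor-command",
                                 {"command-line": "info registers"}))
        stream.flush()
        reply = read_reply(stream)
        return reply.get("return", reply)
```

Calling dump_cpu_state() with the socket path after the guest stops making
progress should return the register dump as text, or the QMP error object
if the command fails.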

Thanks
Tobias
