Message-ID: <YjGzJwjrvxg5YZ0Z@audible.transient.net>
Date:   Wed, 16 Mar 2022 09:51:35 +0000
From:   Jamie Heilman <jamie@...ible.transient.net>
To:     linux-kernel@...r.kernel.org
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        Peter Zijlstra <peterz@...radead.org>
Subject: system locks up with CONFIG_SLS=Y; 5.17.0-rc

I've been (somewhat unsuccessfully) trying to bisect a hard lock-up
of my workstation that occurs on 5.17 rc kernels a few seconds after
I start a kvm guest instance.  Nothing is written to any log and
everything locks up completely; even sysrq stops working.  Bisection
kept moving closer to the branch where the straight-line-speculation
mitigation was enabled, but landing between 9cdbeec40968
("x86/entry_32: Fix segment exceptions") and 3411506550b1
("x86/csum: Rewrite/optimize csum_partial()") didn't give clear
results: there the system definitely starts Oopsing and gets so hosed
that I'm forced to reboot, but it isn't quite as dire, since sysrq
continues to function.  So I decided to just try a build with
CONFIG_SLS disabled, and it turns out that works just fine.  Sooo...

This system uses an Intel Core2 Duo E8400 processor.
Working config (CONFIG_SLS=N) and dmesg at:
http://audible.transient.net/~jamie/k/sls.config-5.17.0-rc8
http://audible.transient.net/~jamie/k/sls.dmesg

(I don't think the dmesg of CONFIG_SLS=Y is really any different.)
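
For anyone comparing a working and a failing build, here is a minimal
sketch of how to confirm which way CONFIG_SLS is set in a given kernel
config file.  The helper name sls_state is hypothetical and not part
of any kernel tooling; it only relies on the usual .config convention
that a disabled option appears as a "# CONFIG_... is not set" comment.

```shell
#!/bin/sh
# Hypothetical helper (not kernel tooling): report how CONFIG_SLS is
# set in the kernel config file given as the first argument.
sls_state() {
    cfg="$1"
    if grep -q '^CONFIG_SLS=y$' "$cfg"; then
        echo enabled
    elif grep -q '^# CONFIG_SLS is not set$' "$cfg"; then
        echo disabled
    else
        # The option is not mentioned at all, e.g. a config from a
        # kernel that predates the option.
        echo absent
    fi
}
```

Running it as `sls_state .config` in a build tree should print
"disabled" for the working config linked above, assuming that file
follows the standard .config layout.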

As far as I know the guest kernel I hand to qemu doesn't really
matter, but the gist of my qemu command line is:

qemu-system-x86_64 -m 2048 -name "$NAME" -machine pc,accel=kvm \
    -nographic -no-user-config -nodefaults -boot strict=on \
    -rtc base=utc -smp 1,sockets=1,cores=1,threads=1 \
    -chardev pipe,id=char0,path="$DIR/monitor" \
    -chardev pty,id=char1 \
    -device isa-serial,chardev=char1 \
    -device virtio-blk-pci,drive=blk0,bootindex=1 \
    -device virtio-net-pci,netdev=net0,"mac=$IF_MAC" \
    -device virtio-rng-pci,rng=rng0,max-bytes=1024,period=3000 \
    -drive "id=blk0,file=/dev/S/$NAME,if=none,format=raw,cache=none" \
    -mon chardev=char0,id=monitor,mode=control \
    -netdev "tap,id=net0,ifname=$NAME,script=no,downscript=no" \
    -object rng-random,id=rng0,filename=/dev/random


No clue what additional debugging would help to enable here, if
anything.  As you can see from the dmesg, I'm using gcc 11.2.0 from
Debian unstable, 4:11.2.0-2 to be exact.  Let me know what other
information would be useful.

-- 
Jamie Heilman                     http://audible.transient.net/~jamie/
