Date:   Thu, 02 Apr 2020 16:31:59 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Dongli Zhang <dongli.zhang@...cle.com>,
        Corentin Labbe <clabbe.montjoie@...il.com>,
        qemu-discuss@...gnu.org, mingo@...hat.com, bp@...en8.de,
        hpa@...or.com, x86@...nel.org
Cc:     linux-kernel@...r.kernel.org
Subject: Re: qemu-x86: kernel panic when host is loaded

Dongli Zhang <dongli.zhang@...cle.com> writes:
> On 4/2/20 2:57 AM, Thomas Gleixner wrote:
>> Corentin Labbe <clabbe.montjoie@...il.com> writes:
>>> On our kernelci lab, each qemu worker runs a healthcheck job every day and after each job failure, so it is heavily used.
>>> The healthcheck job is a Linux boot with a stable release.
>>>
>>> Since we upgraded our worker to buster, the qemu x86_64 healthcheck randomly panics with:
>>> <0>[    0.009000] Kernel panic - not syncing: IO-APIC + timer doesn't work!  Boot with apic=debug and send a report.  Then try booting with the 'noapic' option.
>>>
>>> After some tests I found the source of this kernel panic: the host is
>>> loaded and qemu runs "slower".  Simply renicing all qemu processes
>>> removed this behaviour.
>>>
>>> So what can I do now?
>>> Apart from renicing the qemu processes, is there anything else that could be done?
>> 
>> As the qemu timer/ioapic routing is actually sane, you might try to add
>> "no_timer_check" to the kernel command line.
>> 
>
> Isn't the timer check already permanently disabled by the commit below?
>
> commit a90ede7b17d1 ("KVM: x86: paravirt skip pit-through-ioapic boot check")

Which only helps if the guest kernel has CONFIG_KVM_GUEST enabled...

As Corentin showed that it dies in the timer check, this is clearly not
the case. So adding no_timer_check to the kernel command line for this
case should work around the problem.
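
A rough sketch of how that could look in a test harness, assuming the
guest kernel is booted directly with qemu's -kernel/-append options.
The helper names kvm_guest_enabled and with_no_timer_check, the bzImage
path and the console=ttyS0 argument are made up for illustration and
are not taken from the kernelci setup:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given kernel .config file
# has CONFIG_KVM_GUEST built in.
kvm_guest_enabled() {
    grep -q '^CONFIG_KVM_GUEST=y' "$1"
}

# Hypothetical helper: print the given kernel command line with
# no_timer_check appended, unless it is already present.
with_no_timer_check() {
    case " $1 " in
        *" no_timer_check "*) printf '%s\n' "$1" ;;
        *) printf '%s\n' "${1:+$1 }no_timer_check" ;;
    esac
}

# Example use (kernel image path and other options are placeholders):
# qemu-system-x86_64 -kernel bzImage -nographic \
#     -append "$(with_no_timer_check 'console=ttyS0')"
```

If kvm_guest_enabled succeeds for the guest's .config and qemu runs
with KVM, the commit mentioned above should already skip the check and
the extra parameter is probably redundant, though harmless.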

Thanks,

        tglx
