linux-kernel - Re: [PATCH] random: Fix kernel panic due to system


Message-ID: <57D9CC0D.7010608@hpe.com>
Date:   Wed, 14 Sep 2016 18:15:41 -0400
From:   Waiman Long <waiman.long@....com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
CC:     Theodore Ts'o <tytso@....edu>, Arnd Bergmann <arnd@...db.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Scott J Norton <scott.norton@....com>,
        Douglas Hatch <doug.hatch@....com>
Subject: Re: [PATCH] random: Fix kernel panic due to system_wq use before
 init

On 09/14/2016 05:06 PM, Linus Torvalds wrote:
> On Wed, Sep 14, 2016 at 12:34 PM, Waiman Long<waiman.long@....com>  wrote:
>> I can try, but the 16-socket system that I have at the moment takes a long
>> time (more than an hour) for one shutdown-reboot cycle. It may not be really
>> more interrupts in 4.8, it may be that the random driver just somehow run
>> very slow on my test machine as it seems to have a major rewrite in the 4.8
>> cycle.
> Looking at the random driver updates since 4.7, the only thing I see
> is that .crng_fast_load() for the chacha20 randomness. And that should
> trigger only until it's been initialized, so the cost looks like it
> should be limited.
>
> Is there some fundamental reason you think it's the random driver?
> Other than the oops? Because I'd be more inclined to suspect just some
> apic issue or something, where an actual interrupt line ends up
> screaming or whatever. Is this UV? There's also the CPU hotplug state
> machine changes etc.

Yes, it is because of the oops that I suspect the random driver may be 
the cause.


> But a few rounds of bisecting should hopefully cut down on the
> suspects a lot. A *full* bisect might be 16-17 rounds, but if you can
> do just four or five rounds of bisection, that should still cut it
> down from 14k commits to "only" several hundred..
>
>                 Linus

Yes, I will do a few rounds of bisection to see if we can isolate the
problem. In the meantime, I will also reconfigure the system with fewer
sockets to see if it can be reproduced in a smaller configuration.
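
The arithmetic behind "four or five rounds" can be sketched as follows. This is
a minimal illustration that assumes each round halves the suspect range exactly,
which is an idealized model and will not match the step count git bisect reports
on a real, merge-heavy history. The function names are hypothetical; the 14,000
figure is the "14k commits" quoted above.

```python
import math


def remaining_after(total_commits: int, rounds: int) -> int:
    """Upper bound on suspect commits left after `rounds` idealized halvings."""
    remaining = total_commits
    for _ in range(rounds):
        # Each round tests the midpoint and keeps (at most) the larger half.
        remaining = math.ceil(remaining / 2)
    return remaining


def rounds_to_isolate(total_commits: int) -> int:
    """Idealized halvings needed to narrow the range to a single commit."""
    # Equivalent to ceil(log2(total_commits)) without floating-point rounding.
    return (total_commits - 1).bit_length()


if __name__ == "__main__":
    total = 14_000  # assumption: "14k commits" taken as exactly 14,000
    for n in (4, 5):
        print(f"after {n} rounds: at most {remaining_after(total, n)} commits")
    print(f"idealized full bisect: {rounds_to_isolate(total)} rounds")
```

Under this model, four rounds leave at most 875 suspects and five leave at most
438, consistent with "only" several hundred.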

Cheers,
Longman