Date:   Tue, 8 Aug 2023 12:46:48 -0700
From:   John Stultz <jstultz@...gle.com>
To:     "Jason A. Donenfeld" <Jason@...c4.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Waiman Long <longman@...hat.com>,
        Boqun Feng <boqun.feng@...il.com>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Joel Fernandes <joelaf@...gle.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        kernel-team@...roid.com
Subject: Re: [RFC][PATCH 1/3] test-ww_mutex: Use prng instead of rng to avoid
 hangs at bootup

On Tue, Aug 8, 2023 at 11:20 AM John Stultz <jstultz@...gle.com> wrote:
> On Tue, Aug 8, 2023 at 7:05 AM Jason A. Donenfeld <Jason@...c4.com> wrote:
> > So, from my perspective, you shouldn't see any hang. That function
> > never blocks. I'm happy to look more into what's happening on your end
> > though. Maybe share your .config and qemu command line and I'll see if
> > I can repro?
>
> Yeah, it may just be that the real RNG is slow enough that I'm hitting
> the hung task watchdog?
> (I'm running with 64 cpus, so the test is trying to use 128 threads
> all hitting get_random_u32_below over and over to create their own
> random order of 16 locks)

Just following up on this point: I went through and disabled all the
hung-task and delay detection (and pushed the RCU stall timeout up to
two minutes), and the test did indeed complete without actually
hanging. However, it took roughly 90 seconds to finish using the
get_random_u32_below() calls, whereas with this patch it finishes in
~18s.

So indeed it's not blocking, just not fast enough to avoid the hung
task watchdogs in this admittedly contrived case (though one that has
been helpful in uncovering issues with proposed scheduler changes).
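
To illustrate the idea (this is only a userspace sketch, not the patch
itself): each thread keeps its own small PRNG state and uses it to
shuffle its lock order, so no thread has to go through a shared RNG for
every draw. The xorshift32 generator, the struct and function names, and
the Fisher-Yates shuffle here are all assumptions made for illustration;
the kernel test and its PRNG helpers may look quite different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread PRNG state (xorshift32); not a kernel API. */
struct prng_state {
	uint32_t s;
};

static void prng_seed(struct prng_state *st, uint32_t seed)
{
	/* xorshift state must never be zero */
	st->s = seed ? seed : 1u;
}

static uint32_t prng_next(struct prng_state *st)
{
	uint32_t x = st->s;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	st->s = x;
	return x;
}

/* Return a value in [0, ceil). Plain modulo: slightly biased, fine for a test. */
static uint32_t prng_below(struct prng_state *st, uint32_t ceil)
{
	assert(ceil > 0);
	return prng_next(st) % ceil;
}

/* Fill order[0..n-1] with a pseudo-random permutation of 0..n-1. */
static void random_lock_order(struct prng_state *st, int *order, int n)
{
	for (int i = 0; i < n; i++)
		order[i] = i;

	/* Fisher-Yates shuffle driven by the thread-local state */
	for (int i = n - 1; i > 0; i--) {
		int j = (int)prng_below(st, (uint32_t)i + 1);
		int tmp = order[i];

		order[i] = order[j];
		order[j] = tmp;
	}
}
```

In a setup like the one above, each of the 128 threads would seed its
own state once and then call random_lock_order() for its 16 locks, so
the per-draw cost is a handful of shifts and XORs on thread-local data.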

I'll try to rework the commit message so the above is clear and resubmit.

thanks
-john
