Message-ID: <3f5ff26b-9904-462e-ac22-84b5d212e9ff@kernel.org>
Date: Wed, 9 Apr 2025 13:58:13 +0200
From: Matthieu Baerts <matttbe@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Stanislav Fomichev <sdf@...ichev.me>, netdev@...r.kernel.org,
 davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
 paulmck@...nel.org, joel@...lfernandes.org, steven.price@....com,
 akpm@...ux-foundation.org, anshuman.khandual@....com,
 linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH net-next] configs/debug: run and debug PREEMPT

Hi Jakub,

On 08/04/2025 21:03, Jakub Kicinski wrote:
> On Tue, 8 Apr 2025 20:18:26 +0200 Matthieu Baerts wrote:
>> On 02/04/2025 19:23, Stanislav Fomichev wrote:
>>> Recent change [0] resulted in a "BUG: using __this_cpu_read() in
>>> preemptible" splat [1]. PREEMPT kernels have additional requirements
>>> on what can and cannot run with/without preemption enabled.
>>> Expose those constraints in the debug kernels.
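
(For context, the pattern behind such a splat looks roughly like the
following; a minimal sketch with a made-up per-CPU variable, not the
actual code from [0]:)

	#include <linux/percpu.h>

	/* Hypothetical per-CPU counter, for illustration only. */
	static DEFINE_PER_CPU(int, my_counter);

	static int read_counter_buggy(void)
	{
		/*
		 * With CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT, the
		 * splat fires here: __this_cpu_read() assumes
		 * preemption is already disabled, otherwise the task
		 * can migrate to another CPU between picking the
		 * per-CPU slot and reading it.
		 */
		return __this_cpu_read(my_counter);
	}

	static int read_counter_safe(void)
	{
		/* this_cpu_read() is safe in preemptible context. */
		return this_cpu_read(my_counter);
	}
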
>>
>> Good idea to suggest this; it should help find more bugs!
>>
>> I did some quick tests on my side with our CI: the MPTCP selftests
>> seem to take a bit longer, but without affecting the results.
>> Hopefully, there will be no impact in slower/busy environments :)
> 
> What kind of slowdown do you see? I think we get up to 50% more time
> spent in the longer tests.

That's difficult to measure in our CI, because most of our tests either
create test environments with random parameters (latency, losses, etc.)
or wait for a transfer at a limited speed to finish. Plus, we don't
control the host running our tests. Setting that aside, our packetdrill
tests take ~20% longer on the CI, and our two small KUnit tests took
~10% longer (275ms -> 305ms). Overall, our test suite took maybe
~10-20% longer, and that's acceptable.

So not 50%. Is this difference acceptable for NIPA, even when some
tests are automatically restarted in case of instabilities?

One last thing: Stanislav's patch was shared during Linus' merge
window, so perhaps something else could also be affecting the timing?

> Not sure how bad is too bad...

Did you observe more instabilities? Maybe the individual results should
be ignored, and only debug-specific issues (call traces, kmemleak
reports, etc.) should be looked at?

> I'm leaning
> towards applying this to net-next and we can see if people running
> on linux-next complain?

Good idea! But I do wonder how to run **and monitor** the selftests in
linux-next with a debug kernel :)
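
In case it helps, this is roughly what I would try locally (a sketch;
the MPTCP selftest target and the kmemleak steps are assumptions on my
side, not necessarily what NIPA does):

	# Merge the debug fragment on top of a defconfig, then build.
	make defconfig debug.config
	make -j"$(nproc)"

	# After booting the resulting kernel, run one selftest target.
	make -C tools/testing/selftests TARGETS=net/mptcp run_tests

	# Look for debug-specific issues rather than individual
	# test results.
	dmesg | grep -E 'BUG:|WARNING:'
	echo scan > /sys/kernel/debug/kmemleak
	cat /sys/kernel/debug/kmemleak
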

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

