lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <73f9e6d7-f5c0-4cdc-a9c4-dde3e2fb057c@roeck-us.net>
Date: Sat, 21 Sep 2024 14:08:00 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Vlastimil Babka <vbabka@...e.cz>,
 KUnit Development <kunit-dev@...glegroups.com>,
 Brendan Higgins <brendanhiggins@...gle.com>, David Gow <davidgow@...gle.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
 Joel Fernandes <joel@...lfernandes.org>,
 Josh Triplett <josh@...htriplett.org>, Boqun Feng <boqun.feng@...il.com>,
 Christoph Lameter <cl@...ux.com>, David Rientjes <rientjes@...gle.com>,
 Steven Rostedt <rostedt@...dmis.org>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>,
 Julia Lawall <Julia.Lawall@...ia.fr>, Jakub Kicinski <kuba@...nel.org>,
 "Jason A. Donenfeld" <Jason@...c4.com>,
 "Uladzislau Rezki (Sony)" <urezki@...il.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Roman Gushchin <roman.gushchin@...ux.dev>,
 Hyeonggon Yoo <42.hyeyoo@...il.com>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, rcu@...r.kernel.org,
 Alexander Potapenko <glider@...gle.com>, Marco Elver <elver@...gle.com>,
 Dmitry Vyukov <dvyukov@...gle.com>, kasan-dev@...glegroups.com,
 Jann Horn <jannh@...gle.com>, Mateusz Guzik <mjguzik@...il.com>
Subject: Re: [PATCH v2 7/7] kunit, slub: add test_kfree_rcu() and
 test_leak_destroy()

On 9/21/24 13:40, Vlastimil Babka wrote:
> +CC kunit folks
> 
> On 9/20/24 15:35, Guenter Roeck wrote:
>> Hi,
> 
> Hi,
> 
>> On Wed, Aug 07, 2024 at 12:31:20PM +0200, Vlastimil Babka wrote:
>>> Add a test that will create cache, allocate one object, kfree_rcu() it
>>> and attempt to destroy it. As long as the usage of kvfree_rcu_barrier()
>>> in kmem_cache_destroy() works correctly, there should be no warnings in
>>> dmesg and the test should pass.
>>>
>>> Additionally add a test_leak_destroy() test that leaks an object on
>>> purpose and verifies that kmem_cache_destroy() catches it.
>>>
>>> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
>>
>> This test case, when run, triggers a warning traceback.
>>
>> kmem_cache_destroy TestSlub_kfree_rcu: Slab cache still has objects when called from test_leak_destroy+0x70/0x11c
>> WARNING: CPU: 0 PID: 715 at mm/slab_common.c:511 kmem_cache_destroy+0x1dc/0x1e4
> 
> Yes that should be suppressed like the other slub_kunit tests do. I have
> assumed it's not that urgent because for example the KASAN kunit tests all
> produce tons of warnings and thus assumed it's in some way acceptable for
> kunit tests to do.
> 

I have all tests which generate warning backtraces disabled. Trying to identify
which warnings are noise and which warnings are on purpose doesn't scale,
so it is all or nothing for me. I tried earlier to introduce a patch series
which would enable selective backtrace suppression, but that died the death
of architecture maintainers not caring and people demanding it to be perfect
(meaning it only addressed WARNING: backtraces and not BUG: backtraces,
and apparently that wasn't good enough).

If the backtrace is intentional (and I think you are saying that it is),
I'll simply disable the test. That may be a bit counter-productive, but
there is really no alternative for me.

>> That is, however, not the worst of it. It also causes boot stalls on
>> several platforms and architectures (various arm platforms, arm64,
>> loongarch, various ppc, and various x86_64). Reverting it fixes the
>> problem. Bisect results are attached for reference.
> 
> OK, this part is unexpected. I assume you have the test built-in and not a
> module, otherwise it can't affect boot? And by stall you mean a delay or a

Yes.

> complete lockup? I've tried to reproduce that with virtme, but it seemed
> fine, maybe it's .config specific?

It is a complete lockup.

> 
> I do wonder about the placement of the call of kunit_run_all_tests() from
> kernel_init_freeable() as that's before a bunch of initialization finishes.
> 
> For example, system_state = SYSTEM_RUNNING; and rcu_end_inkernel_boot() only
> happens later in kernel_init(). I wouldn't be surprised if that means
> calling kfree_rcu() or rcu_barrier() or kvfree_rcu_barrier() as part of the
> slub tests is too early.
> 
> Does the diff below fix the problem? Any advice from kunit folks? I could
> perhaps possibly make the slab test module-only instead of tristate or do
> some ifdef builtin on the problematic tests, but maybe it wouldn't be
> necessary with kunit_run_all_tests() happening a bit later.
> 

It does, at least based on my limited testing. However, given that the
backtrace is apparently intentional, it doesn't really matter - I'll disable
the test instead.

Thanks,
Guenter


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ