linux-kernel - Re: for_each_domain()/sched_domain_span() has offline CPUs (was Re: [PATCH 2/2] timers: Fix removed self-IPI on global timer's enqueue in nohz

Message-ID: <xhsmhttkrbvfb.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
Date: Wed, 27 Mar 2024 15:28:56 +0100
From: Valentin Schneider <vschneid@...hat.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, Thomas Gleixner
 <tglx@...utronix.de>, LKML <linux-kernel@...r.kernel.org>, Ingo Molnar
 <mingo@...nel.org>, Anna-Maria Behnsen <anna-maria@...utronix.de>, Alex
 Shi <alexs@...nel.org>, Peter Zijlstra <peterz@...radead.org>, Vincent
 Guittot <vincent.guittot@...aro.org>, Barry Song
 <song.bao.hua@...ilicon.com>
Subject: Re: for_each_domain()/sched_domain_span() has offline CPUs (was Re:
 [PATCH 2/2] timers: Fix removed self-IPI on global timer's enqueue in
 nohz_full)

On 27/03/24 13:42, Frederic Weisbecker wrote:
> On Tue, Mar 26, 2024 at 05:46:07PM +0100, Valentin Schneider wrote:
>> > Then with that patch I ran TREE07, just some short iterations:
>> >
>> > tools/testing/selftests/rcutorture/bin/kvm.sh --configs "10*TREE07" --allcpus --bootargs "rcutorture.onoff_interval=200" --duration 2
>> >
>> > And the warning triggers very quickly. At least since v6.3 but maybe since
>> > earlier. Is this expected behaviour or am I right to assume that
>> > for_each_domain()/sched_domain_span() shouldn't return an offline CPU?
>> >
>> 
>> I would very much assume an offline CPU shouldn't show up in a
>> sched_domain_span().
>> 
>> Now, on top of the above, there's one more thing worth noting:
>>   cpu_up_down_serialize_trainwrecks()
>> 
>> This just flushes the cpuset work, so after that the sched_domain topology
>> should be sane. However I see it's invoked at the tail end of _cpu_down(),
>> IOW /after/ takedown_cpu() has run, which sounds too late. The comments
>> around this vs. lock ordering aren't very reassuring however, so I need to
>> look into this more.
>
> Ouch...
>
>> 
>> Maybe as a "quick" test to see if this is the right culprit, you could try
>> that with CONFIG_CPUSETS=n? Because in that case the sched_domain update is
>> run within sched_cpu_deactivate().
>
> I just tried and I fear that doesn't help. It still triggers even without
> cpusets :-s
>

What, you mean I can't always blame cgroups? What has the world come to?

That's interesting, it means the deferred work item isn't the (only)
issue. I'll grab your test patch and try to reproduce on TREE07.
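
For reference, the invariant under discussion (no offline CPU should show
up in a sched_domain_span()) can be modelled in userspace roughly as below.
This is only a sketch under stated assumptions, not the test patch itself:
cpu_mask_t, span_offline_cpus() and span_is_sane() are hypothetical names
made up for illustration, and a plain 64-bit word stands in for a real
cpumask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a cpumask: bit N set means CPU N. */
typedef uint64_t cpu_mask_t;

/* Return the CPUs that are present in a domain span but are not online. */
static cpu_mask_t span_offline_cpus(cpu_mask_t span, cpu_mask_t online)
{
	return span & ~online;
}

/* The invariant discussed in this thread: a span holds no offline CPU. */
static bool span_is_sane(cpu_mask_t span, cpu_mask_t online)
{
	return span_offline_cpus(span, online) == 0;
}
```

With CPUs 0-3 in a span and only CPUs 0-2 online, span_offline_cpus()
reports bit 3, which is the situation the warning is meant to catch.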

> Thanks.