linux-kernel - Re: [RFC] sched/deadline: only mark active cpu as free

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <0e69b462-1666-4a17-bde3-758d0d81a7d5@gmail.com>
Date: Tue, 14 Jan 2025 15:01:04 -0800
From: Doug Berger <opendmb@...il.com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
 Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
 Florian Fainelli <florian.fainelli@...adcom.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [RFC] sched/deadline: only mark active cpu as free

On 1/13/2025 2:54 AM, Juri Lelli wrote:
> Hi Doug,
> 
> On 10/01/25 15:30, Doug Berger wrote:
>> There is a hazard in the deadline scheduler where an offlined CPU
>> can have its free_cpus bit left set in the def_root_domain when
>> the schedutil cpufreq governor is used. This can allow a deadline
>> thread to be pushed to the runqueue of a powered down CPU which
>> breaks scheduling.
>>
>> This commit works around the issue by only setting the free_cpus
>> bit for a CPU when it is "active". It is likely that the ordering
>> of sched_set_rq_online() and set_cpu_active() at the end of the
>> sched_cpu_deactivate() function should be revisited if this
>> approach has merit.
>>
>> Signed-off-by: Doug Berger <opendmb@...il.com>
>> ---
<snip>
>> This problem appears to have been initially introduced by commit
>> 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control") which
>> moved the set_rq_offline() handling from sched_cpu_dying() to
>> sched_cpu_deactivate(). The original sequence allowed the free_cpus
>> bit to be forcibly cleared in the def_root_domains after all of the
>> scheduler dust settled. The new location makes the
>> sched_set_rq_offline() essentially meaningless for the deadline
>> scheduler since the managing of changed scheduling domains happens
>> later.
>>
>> There are likely many different approaches to address this issue
>> and I'm hopeful that somone more familiar with the scheduler than
>> I can propose a better solution than the one suggested here.
>>
>> Thank you for reading this far. Any advice is appreciated.
> 
> Thanks a lot for the detailed analysis!
Thanks for the review!

> 
> I actually fear that the issue is due to the cpudl_clear_freecpu() call
> in rq_offline_dl() being racy, as we don't hold cp->lock while calling
> that. So, I think your solution below might be almost correct. I am
> thinking we should do something similar in cpudl_set() and remove cpudl_
> {set,clear}_freecpu() calls altogether.
> 
> What do you think? If agree, care to update your patch please? :)
As I noted, I'm not real keen on what I proposed here. In particular I 
am wary of introducing any extra processing to normal deadline scheduler 
execution paths.

I am about to submit an alternative that only affects paths when the 
topology is changing that I hope you will take another look at ;).

I am still very much open to someone saying something like "You know we 
could probably bring the def_root_domain of the onlining CPU online 
earlier so that set_rq_offline() gets called for it when attaching to 
the new scheduling domain." I just don't feel comfortable enough yet 
with my understanding of the scheduler to know what all of the 
implications of such a change would be.

> 
> Best,
> Juri

Thanks again!
     Doug
> 
>>   kernel/sched/cpudeadline.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/cpudeadline.c b/kernel/sched/cpudeadline.c
>> index 95baa12a1029..6896bbe0e9ae 100644
>> --- a/kernel/sched/cpudeadline.c
>> +++ b/kernel/sched/cpudeadline.c
>> @@ -195,7 +195,8 @@ void cpudl_clear(struct cpudl *cp, int cpu)
>>   		cp->elements[cpu].idx = IDX_INVALID;
>>   		cpudl_heapify(cp, old_idx);
>>   
>> -		cpumask_set_cpu(cpu, cp->free_cpus);
>> +		if (cpu_active(cpu))
>> +			cpumask_set_cpu(cpu, cp->free_cpus);
>>   	}
>>   	raw_spin_unlock_irqrestore(&cp->lock, flags);
>>   }
>> -- 
>> 2.34.1
>>
>