[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <88998fa9-b454-45c1-a8e2-164d2e2d94c0@gmail.com>
Date: Tue, 23 Sep 2025 11:03:18 -0700
From: Doug Berger <opendmb@...il.com>
To: Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
linux-kernel@...r.kernel.org,
Florian Fainelli <florian.fainelli@...adcom.com>
Subject: Re: [PATCH v2] sched/deadline: only set free_cpus for online
runqueues
On 9/4/2025 9:43 PM, Doug Berger wrote:
> On 9/3/2025 12:54 AM, Peter Zijlstra wrote:
>> On Mon, Aug 18, 2025 at 02:58:19PM +0200, Juri Lelli wrote:
>>> Hello,
>>>
>>> On 14/08/25 18:22, Doug Berger wrote:
>>>> Commit 16b269436b72 ("sched/deadline: Modify cpudl::free_cpus
>>>> to reflect rd->online") introduced the cpudl_set/clear_freecpu
>>>> functions to allow the cpu_dl::free_cpus mask to be manipulated
>>>> by the deadline scheduler class rq_on/offline callbacks so the
>>>> mask would also reflect this state.
>>>>
>>>> Commit 9659e1eeee28 ("sched/deadline: Remove cpu_active_mask
>>>> from cpudl_find()") removed the check of the cpu_active_mask to
>>>> save some processing on the premise that the cpudl::free_cpus
>>>> mask already reflected the runqueue online state.
>>>>
>>>> Unfortunately, there are cases where it is possible for the
>>>> cpudl_clear function to set the free_cpus bit for a CPU when the
>>>> deadline runqueue is offline. When this occurs while a CPU is
>>>> connected to the default root domain the flag may retain the bad
>>>> state after the CPU has been unplugged. Later, a different CPU
>>>> that is transitioning through the default root domain may push a
>>>> deadline task to the powered down CPU when cpudl_find sees its
>>>> free_cpus bit is set. If this happens the task will not have the
>>>> opportunity to run.
>>>>
>>>> One example is outlined here:
>>>> https://lore.kernel.org/lkml/20250110233010.2339521-1-opendmb@gmail.com
>>>>
>>>> Another occurs when the last deadline task is migrated from a
>>>> CPU that has an offlined runqueue. The dequeue_task member of
>>>> the deadline scheduler class will eventually call cpudl_clear
>>>> and set the free_cpus bit for the CPU.
>>>>
>>>> This commit modifies the cpudl_clear function to be aware of the
>>>> online state of the deadline runqueue so that the free_cpus mask
>>>> can be updated appropriately.
>>>>
>>>> It is no longer necessary to manage the mask outside of the
>>>> cpudl_set/clear functions so the cpudl_set/clear_freecpu
>>>> functions are removed. In addition, since the free_cpus mask is
>>>> now only updated under the cpudl lock the code was changed to
>>>> use the non-atomic __cpumask functions.
>>>>
>>>> Signed-off-by: Doug Berger <opendmb@...il.com>
>>>> ---
>>>
>>> This looks now good to me.
>>>
>>> Acked-by: Juri Lelli <juri.lelli@...hat.com>
>>
>> So I just had a look at said patch because Juri here poked me; and I
>> came away with the feeling that cpudl_clear() is now a misnomen, seeing
>> how it is called from rq_online_dl().
>>
>> Would cpudl_update() be a better name?
> Thanks for taking a look.
>
> I don't really have a dog in any fights over naming here, but it seems
> to me that cpudl_clear and cpudl_set are intended to be complementary
> functions and the naming reflects that. It would appear that these are
> primarily intended to maintain the cpudl max-heap entries which is what
> are being set and cleared.
>
> rq_online_dl() would now call one or the other based on whether any
> deadline tasks are running on the queue when it is onlined to ensure
> that the max-heap is valid. This either clears a stale entry that may
> occur from scenarios like the ones I'm running into or set the entry to
> the current deadline. In this context the names seem appropriate to me.
>
> Renaming cpudl_clear to cpudl_update may be more confusing since the
> comment for cpudl_set reads "cpudl_set - update the cpudl max-heap".
>
> I don't feel that the name change is relevant to my patch, but if we
> want to do it concurrently maybe cpudl_clear_max_heap() and
> cpudl_set_max_heap() would be more meaningful.
>
> Please let me know how you would like to proceed,
> Doug
Is there any way I can help to move this patch forward?
Thanks,
Doug
Powered by blists - more mailing lists