Message-ID: <f4a95e4b-93f6-4b36-b077-58d1c05bdfa2@intel.com>
Date: Tue, 14 Oct 2025 20:07:10 +0800
From: "Chen, Yu C" <yu.c.chen@...el.com>
To: Adam Li <adamli@...amperecomputing.com>
CC: <dietmar.eggemann@....com>, <rostedt@...dmis.org>, <bsegall@...gle.com>,
<mgorman@...e.de>, <vschneid@...hat.com>, <cl@...ux.com>,
<linux-kernel@...r.kernel.org>, <patches@...erecomputing.com>,
<shkaushik@...erecomputing.com>, <mingo@...hat.com>, <peterz@...radead.org>,
<juri.lelli@...hat.com>, <vincent.guittot@...aro.org>
Subject: Re: [PATCH RESEND] sched/fair: Only update stats for allowed CPUs
 when looking for dst group

On 10/14/2025 6:51 PM, Adam Li wrote:
> Hi Chenyu,
>
> Thanks for your comments.
> On 10/12/2025 1:42 AM, Chen, Yu C wrote:
>> On 10/11/2025 2:43 PM, Adam Li wrote:
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index bc0b7ce8a65d..d5ec15050ebc 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -10671,7 +10671,7 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
>>>  	if (sd->flags & SD_ASYM_CPUCAPACITY)
>>>  		sgs->group_misfit_task_load = 1;
>>>
>>> -	for_each_cpu(i, sched_group_span(group)) {
>>> +	for_each_cpu_and(i, sched_group_span(group), p->cpus_ptr) {
>>
>> Looks good to me. One minor question: would pre-calculating the mask be better?
>
> I do agree that pre-calculating the cpumask can save CPU cycles, by
> avoiding the mask AND at each loop iteration.
>
>> Copied from select_idle_cpu():
>>
>> 	cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
>> 	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>> 	for_each_cpu(i, cpus) {
>>
> But I am not sure if it is safe to use the percpu 'select_rq_mask'
> in update_sg_wakeup_stats(). Otherwise we would have to allocate a 'struct cpumask'.
>
Allocating dynamically would be costly. Using the percpu select_rq_mask is
safe in this scenario: the waker's CPU has already disabled local irqs
via raw_spin_lock_irqsave(&p->pi_lock), so I suppose no one can modify
it simultaneously. Moreover, if the fast wakeup path select_idle_sibling()
can use it, the slow path sched_balance_find_dst_cpu() should also be able
to do so IMO.
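Roughly, the call chain I have in mind is something like this (paraphrased,
not the exact source, with the select_task_rq() arguments elided):

	try_to_wake_up(p)
	  raw_spin_lock_irqsave(&p->pi_lock, flags);	/* local irqs off */
	  cpu = select_task_rq(p, ...);			/* -> select_task_rq_fair() */
		/* fast path: select_idle_sibling()        -> already uses select_rq_mask */
		/* slow path: sched_balance_find_dst_cpu() -> could use it the same way   */
	  raw_spin_unlock_irqrestore(&p->pi_lock, flags);

so both paths run with p->pi_lock held and local irqs disabled on the
waker's CPU.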
> I tested the below patch. It works and fixes the bug.
> If it is safe to use 'select_rq_mask', I can submit a V2 patch.
>
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10664,6 +10664,7 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
>  					  struct task_struct *p)
>  {
>  	int i, nr_running;
> +	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
>
>  	memset(sgs, 0, sizeof(*sgs));
>
> @@ -10671,7 +10672,8 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
>  	if (sd->flags & SD_ASYM_CPUCAPACITY)
>  		sgs->group_misfit_task_load = 1;
>
> -	for_each_cpu(i, sched_group_span(group)) {
nice-to-have:
maybe add a comment here that cpus is not empty, because we have the
cpumask_intersects() check in sched_balance_find_dst_group() (just in
case sgs->group_type incorrectly remains 0, which is group_has_spare,
if cpus were empty).
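Just as an illustration (exact wording up to you), something like:

	/*
	 * 'cpus' is never empty here: sched_balance_find_dst_group()
	 * skips groups that do not intersect p->cpus_ptr. Otherwise an
	 * empty mask would leave sgs zeroed and group_type would wrongly
	 * stay group_has_spare.
	 */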
> +	cpumask_and(cpus, sched_group_span(group), p->cpus_ptr);
> +	for_each_cpu(i, cpus) {
>  		struct rq *rq = cpu_rq(i);
>  		unsigned int local;
>
>
And from my understanding, for this percpu version,
Reviewed-by: Chen Yu <yu.c.chen@...el.com>
thanks,
Chenyu