Message-ID: <8735nkcwov.mognet@arm.com>
Date: Thu, 25 Nov 2021 11:16:16 +0000
From: Valentin Schneider <Valentin.Schneider@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>,
Vincent Donnefort <Vincent.Donnefort@....com>
Cc: peterz@...radead.org, mingo@...hat.com,
linux-kernel@...r.kernel.org, mgorman@...hsingularity.net,
dietmar.eggemann@....com
Subject: Re: [PATCH] sched/fair: Fix detection of per-CPU kthreads waking a task
On 25/11/21 10:05, Vincent Guittot wrote:
> On Wed, 24 Nov 2021 at 16:42, Vincent Donnefort
> <vincent.donnefort@....com> wrote:
>>
>> select_idle_sibling() will return prev_cpu for the case where the task is
>> woken up by a per-CPU kthread. However, the idle task has been recently
>> modified and is now identified by is_per_cpu_kthread(), breaking the
>> behaviour described above. Using !is_idle_task() ensures we do not
>> spuriously trigger that select_idle_sibling() exit path.
>>
>> Fixes: 00b89fe0197f ("sched: Make the idle task quack like a per-CPU kthread")
>> Signed-off-by: Vincent Donnefort <vincent.donnefort@....com>
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 945d987246c5..8bf95b0e368d 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6399,6 +6399,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
>> * pattern is IO completions.
>> */
>> if (is_per_cpu_kthread(current) &&
>> + !is_idle_task(current) &&
>> prev == smp_processor_id() &&
>> this_rq()->nr_running <= 1) {
>> return prev;
>
> AFAICT, this can't be possible for a symmetric system because it would
> have been already returned by other conditions.
> Only an asymmetric system can face such a situation if the task
> doesn't fit which is the subject of your other patch.
> so this patch seems irrelevant outside the other one
>
I think you can still hit this on a symmetric system; let me try to
reformulate my other email.
If this (non-patched) condition evaluates to true, it means the previous
condition

    (available_idle_cpu(target) || sched_idle_cpu(target)) &&
    asym_fits_capacity(task_util, target)

evaluated to false, so for a symmetric system target sure isn't idle.
prev == smp_processor_id() implies prev == target, IOW prev isn't
idle. Now, consider:
  p0.prev = CPU1
  p1.prev = CPU1

  CPU0                       CPU1

  current = don't care       current = swapper/1

  ttwu(p1)
    ttwu_queue(p1, CPU1)
    // or
    ttwu_queue_wakelist(p1, CPU1)

                             hrtimer_wakeup()
                               wake_up_process()
                                 ttwu()
                                   idle_cpu(CPU1)? no

                                   is_per_cpu_kthread(current)? yes
                                   prev == smp_processor_id()?  yes
                                   this_rq()->nr_running <= 1?  yes

                                   => self enqueue

                                   ...

                             schedule_idle()
This works if CPU0 does either a full enqueue (rq->nr_running == 1) or just
a wakelist enqueue (rq->ttwu_pending > 0). If there was an idle CPU3
around, we'd still be stacking p0 and p1 onto CPU1.
IOW this opens a window between a remote ttwu() and the idle task invoking
schedule_idle() where the idle task can stack more tasks onto its CPU.
>
>> --
>> 2.25.1
>>