Message-ID: <aa06b988-e2bd-4ff9-caab-def399698bf2@linux.alibaba.com>
Date: Thu, 24 Sep 2020 16:54:42 +0800
From: Xunlei Pang <xlpang@...ux.alibaba.com>
To: Vincent Guittot <vincent.guittot@...aro.org>,
Xunlei Pang <xlpang@...ux.alibaba.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Jiang Biao <benbjiang@...cent.com>,
Wetp Zhang <wetp.zy@...ux.alibaba.com>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RESEND] sched/fair: Fix wrong cpu selecting from isolated
domain
On 9/24/20 3:18 PM, Vincent Guittot wrote:
> On Thu, 24 Sep 2020 at 08:48, Xunlei Pang <xlpang@...ux.alibaba.com> wrote:
>>
>> We have occasionally seen tasks with a full cpumask (e.g. tasks
>> placed into a cpuset or explicitly set to full affinity) being
>> migrated to our isolated cpus in our production environment.
>>
>> After some analysis, we found that it is due to the current
>> select_idle_smt() not considering the sched_domain mask.
>>
>> Steps to reproduce on my 32-CPU hyperthreaded machine:
>> 1. with boot parameter: "isolcpus=domain,2-31"
>> (thread lists: 0,16 and 1,17)
>> 2. cgcreate -g cpu:test; cgexec -g cpu:test "test_threads"
>> 3. some threads will be migrated to the isolated cpu16~17.
>>
>> Fix it by checking the valid domain mask in select_idle_smt().
>>
>> Fixes: 10e2f1acd010 ("sched/core: Rewrite and improve select_idle_siblings()")
>> Reported-by: Wetp Zhang <wetp.zy@...ux.alibaba.com>
>> Reviewed-by: Jiang Biao <benbjiang@...cent.com>
>> Signed-off-by: Xunlei Pang <xlpang@...ux.alibaba.com>
>
> Reviewed-by: Vincent Guittot <vincent.guittot@...aro.org>
>
Thanks, Vincent :-)
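
For anyone reading the thread later, the idea behind the fix can be
sketched as a small user-space model. This is not the kernel code: the
type cpumask_model_t, the helpers and the check_domain switch are all
hypothetical names made up for illustration, and each mask is just one
bit per CPU. It only shows why an SMT sibling that is allowed by the
task's affinity but lies outside the sched_domain span should be skipped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_model_t;

static int mask_test_cpu(int cpu, cpumask_model_t mask)
{
	return (int)((mask >> cpu) & 1u);
}

/*
 * Model of picking an idle SMT sibling of the target CPU.
 *   smt_mask     SMT siblings of the target (target included)
 *   allowed      affinity mask of the task
 *   domain_span  CPUs covered by the sched domain being searched
 *   idle         CPUs that are currently idle
 *   check_domain 0 models the reported behaviour, non-zero models
 *                the extra domain mask check described in the patch
 * Returns the chosen CPU, or -1 if no sibling qualifies.
 */
static int select_idle_smt_model(cpumask_model_t smt_mask,
				 cpumask_model_t allowed,
				 cpumask_model_t domain_span,
				 cpumask_model_t idle,
				 int check_domain)
{
	for (int cpu = 0; cpu < 64; cpu++) {
		if (!mask_test_cpu(cpu, smt_mask))
			continue;
		if (!mask_test_cpu(cpu, allowed))
			continue;
		/* The added check: never leave the domain span. */
		if (check_domain && !mask_test_cpu(cpu, domain_span))
			continue;
		if (mask_test_cpu(cpu, idle))
			return cpu;
	}
	return -1;
}

/* Scenario from the report: isolcpus=domain,2-31, siblings 0,16. */
static void check_reported_scenario(void)
{
	cpumask_model_t smt = (1ULL << 0) | (1ULL << 16);
	cpumask_model_t allowed = 0xFFFFFFFFULL;  /* full affinity */
	cpumask_model_t span = 0x3ULL;            /* cpu0-1 only */
	cpumask_model_t idle = 0xFFFFFFFFULL & ~0x3ULL; /* cpu0-1 busy */

	/* Without the check the isolated sibling cpu16 is picked. */
	assert(select_idle_smt_model(smt, allowed, span, idle, 0) == 16);
	/* With the check no sibling qualifies. */
	assert(select_idle_smt_model(smt, allowed, span, idle, 1) == -1);
}
```

Calling check_reported_scenario() from a test program exercises both
cases; the masks mirror the reproduction steps quoted above.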