Message-ID: <6bbd4f4d-d64c-0c9d-90d5-9122e6b21835@bytedance.com>
Date: Tue, 20 Sep 2022 15:42:51 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: Peter Zijlstra <peterz@...radead.org>, Mel Gorman <mgorman@...e.de>
Cc: Vincent Guittot <vincent.guittot@...aro.org>,
Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: Fix misuse of available_idle_cpu()
Ping :)
On 9/8/22 9:49 PM, Abel Wu wrote:
> On 9/8/22 9:17 PM, Peter Zijlstra wrote:
>> On Thu, Sep 08, 2022 at 11:36:32AM +0100, Mel Gorman wrote:
>>> On Thu, Sep 08, 2022 at 04:07:02PM +0800, Abel Wu wrote:
>>>> The function available_idle_cpu() was introduced to distinguish
>>>> between the code paths that care whether the vCPU is preempted
>>>> and the ones that don't. In general, available_idle_cpu() is used
>>>> when selecting cpus for immediate use, e.g. ttwu, while idle_cpu()
>>>> is used in the paths that only care whether the cpu is idle or
>>>> not, and __update_idle_core() is one of them.
>>>>
>>>> Use idle_cpu() instead in the idle path to make has_idle_core
>>>> a better hint.
>>>>
>>>> Fixes: 943d355d7fee ("sched/core: Distinguish between idle_cpu() calls based on desired effect, introduce available_idle_cpu()")
>>>> Signed-off-by: Abel Wu <wuyun.abel@...edance.com>
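For readers following the thread, the distinction drawn in the changelog
can be sketched in user space. This is a toy model, not kernel code:
struct model_cpu and the model_*() helpers are made-up names, and the
only kernel behaviour assumed is that available_idle_cpu() amounts to
idle_cpu() plus a check that the vCPU is not preempted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for a user-space model; not a kernel type. */
struct model_cpu {
	bool idle;      /* stands in for what idle_cpu() reports */
	bool preempted; /* stands in for what vcpu_is_preempted() reports */
};

/* Models idle_cpu(): only asks whether the cpu is idle. */
static bool model_idle_cpu(const struct model_cpu *c)
{
	return c->idle;
}

/* Models available_idle_cpu(): idle and the vCPU is not preempted. */
static bool model_available_idle_cpu(const struct model_cpu *c)
{
	return model_idle_cpu(c) && !c->preempted;
}

/*
 * Models the has_idle_core decision: the core counts as idle only if
 * every SMT sibling passes the chosen predicate.
 */
static bool model_core_is_idle(const struct model_cpu *smt, int n,
			       bool (*pred)(const struct model_cpu *))
{
	assert(n > 0);
	for (int i = 0; i < n; i++)
		if (!pred(&smt[i]))
			return false;
	return true;
}
```

With two idle siblings of which one is a preempted vCPU, passing
model_idle_cpu reports the core as idle while passing
model_available_idle_cpu does not, which is the difference the patch
is about.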
>>>
>>> Seems fair. As vCPU preemption is specific to virtualisation, it is very
>>> unlikely that SMT is exposed to the guest so the impact of the patch is
>>
>> Right; only pinned guests typically expose such topology information
>> (anything else would be quite broken).
Yes, and it is common in our ECS servers to use pinned guests.
>>
>>> minimal but I still think it's right so;
>>
>> I'm not convinced; all of select_idle_sibling() seems to use
>> available_idle_cpu(), and that's the only consumer of
>> __update_idle_core(), so in that respect the current state makes sense.
>
> Hi Peter, Mel, thanks for reviewing!
>
> My thought was that the preempted core can become active again before
> select_idle_sibling() is called, so using available_idle_cpu() in
> __update_idle_core() can potentially lose the opportunity to kick an
> idle core into running. The downside of using idle_cpu() is that a
> full scan can be triggered regardless of whether any non-preempted
> idle core exists, but even available_idle_cpu() cannot guarantee that.
>
> BTW, I am also confused by select_idle_core(), in which all the cpus
> of a core need to be non-preempted before the core can be taken as an
> idle core. IMHO, it might be enough that at least one cpu of an idle
> core is non-preempted and allowed by the task's taskset.
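
The select_idle_core() question above can be illustrated the same way.
The sketch below is a simplified user-space model under stated
assumptions, not the kernel implementation: struct model_sibling,
model_pick_strict() and model_pick_relaxed() are hypothetical names,
and "allowed" stands in for the task's affinity mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical SMT sibling state for a user-space model. */
struct model_sibling {
	bool idle;      /* cpu is idle */
	bool preempted; /* vCPU is currently preempted by the host */
	bool allowed;   /* cpu is in the task's affinity mask */
};

/*
 * Strict rule, as described in the thread: every sibling must be idle
 * and non-preempted. Returns the first allowed sibling, or -1.
 */
static int model_pick_strict(const struct model_sibling *smt, int n)
{
	int pick = -1;

	assert(n > 0);
	for (int i = 0; i < n; i++) {
		if (!smt[i].idle || smt[i].preempted)
			return -1;
		if (pick < 0 && smt[i].allowed)
			pick = i;
	}
	return pick;
}

/*
 * Relaxed rule, as suggested in the thread: every sibling must be
 * idle, but one non-preempted and allowed sibling is enough.
 */
static int model_pick_relaxed(const struct model_sibling *smt, int n)
{
	int pick = -1;

	assert(n > 0);
	for (int i = 0; i < n; i++) {
		if (!smt[i].idle)
			return -1;
		if (pick < 0 && !smt[i].preempted && smt[i].allowed)
			pick = i;
	}
	return pick;
}
```

For a core whose first sibling is an idle but preempted vCPU and whose
second sibling is idle, non-preempted and allowed, the strict rule
rejects the core while the relaxed rule picks the second sibling.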