Message-ID: <20250722114654.2620626-1-wangtao554@huawei.com>
Date: Tue, 22 Jul 2025 11:46:54 +0000
From: Wang Tao <wangtao554@...wei.com>
To: <graf@...zon.com>
CC: <kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>, <mingo@...hat.com>,
<nh-open-source@...zon.com>, <peterz@...radead.org>, <sieberf@...zon.com>,
<vincent.guittot@...aro.org>, <tanghui20@...wei.com>
Subject: [PATCH] Re: [PATCH] sched/fair: Only increment deadline once on yield
>> On 01/04/25 18:06, Fernand Sieber wrote:
>> If a task yields, the scheduler may decide to pick it again. The task in
>> turn may decide to yield immediately or shortly after, leading to a tight
>> loop of yields.
>>
>> If there's another runnable task at this point, the deadline will be
>> increased by the slice on each loop. This can cause the deadline to run away
>> pretty quickly, and lead to elevated run delays later on as the task
>> doesn't get picked again. The reason the scheduler can pick the same task
>> again and again despite its deadline increasing is because it may be the
>> only eligible task at that point.
>>
>> Fix this by updating the deadline only to one slice ahead.
>>
>> Note, we might want to consider iterating on the implementation of yield as a
>> follow-up:
>> * the yielding task could be forfeiting its remaining slice by
>> incrementing its vruntime correspondingly
>> * in case of yield_to the yielding task could be donating its remaining
>> slice to the target task
>>
>> Signed-off-by: Fernand Sieber <sieberf@...zon.com>
>IMHO it's worth noting that this is not a theoretical issue. We have
>seen this in real life: A KVM virtual machine's vCPU which runs into a
>busy guest spin lock calls kvm_vcpu_yield_to() which eventually ends up
>in the yield_task_fair() function. We have seen such spin locks due to
>guest contention rather than host overcommit, which means we go into a
>loop of vCPU execution and spin loop exit, which results in an
>undesirable increase in the vCPU thread's deadline.
>Given this impacts real workloads and is a bug present since the
>introduction of EEVDF, I would say it warrants a
>Fixes: 147f3efaa24182 ("sched/fair: Implement an EEVDF-like scheduling
>policy")
>tag.
>Alex
As Alex described, we encountered the same issue in the following test
scenario: start qemu, bind it to a cpuset group with cpuset.cpus=1-3, and
then run taskset -c 1-3 ./stress-ng -c 20 to stress test in qemu. qemu then
freezes and reports a soft lockup. After applying this patch, the problem
no longer reproduces.
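
For context, here is a rough sketch of how I understand the behavioural
change, written against the EEVDF fields in kernel/sched/fair.c. This is a
simplified illustration based on the patch description, not the exact diff:

	/*
	 * Simplified sketch of the relevant logic in yield_task_fair().
	 *
	 * Today every yield pushes the deadline another slice further
	 * out, so a task that yields in a tight loop (e.g. a vCPU
	 * spinning on a contended guest lock) accumulates many slices
	 * worth of deadline while it is still the only eligible task:
	 *
	 *	se->deadline += calc_delta_fair(se->slice, se);
	 *
	 * With the fix, the deadline is only moved to one slice ahead
	 * of the current vruntime, so repeated yields no longer
	 * compound:
	 */
	se->deadline = se->vruntime + calc_delta_fair(se->slice, se);

With that, even if the scheduler keeps picking the yielding task, its
deadline stays bounded to vruntime plus one slice instead of growing
without bound.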
Are there plans to merge this patch into mainline?