Message-ID: <e57c7166-b484-0d32-e4e8-5a47ef0bb53c@bytedance.com>
Date: Sun, 13 Mar 2022 18:06:58 +0800
From: chenying <chenying.kernel@...edance.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, mgorman@...e.de, bristot@...hat.com,
bsegall@...gle.com, linux-kernel@...r.kernel.org,
duanxiongchun@...edance.com, zhouchengming@...edance.com,
songmuchun@...edance.com, zhengqi.arch@...edance.com,
zhoufeng.zf@...edance.com, ligang.bdlg@...edance.com
Subject: Re: [External] Re: [PATCH] sched/fair: prioritize normal
task over sched_idle task with vruntime offset

On 2022/3/13 17:02, Peter Zijlstra wrote:
> On Sun, Mar 13, 2022 at 01:37:37PM +0800, chenying wrote:
>> On 2022/3/12 20:03, Peter Zijlstra wrote:
>>> On Fri, Mar 11, 2022 at 03:58:47PM +0800, chenying wrote:
>>>> We add a time offset to se->vruntime when an idle sched_entity is
>>>> enqueued, so that the idle entity is always placed to the right of
>>>> the non-idle entities in the runqueue. This allows non-idle tasks
>>>> to be selected and run before the idle ones.
>>>>
>>>> A use-case is running background tasks as sched_idle and foreground
>>>> tasks as non-idle. The foreground tasks are latency sensitive and
>>>> must not be disturbed by the background. It is well known that idle
>>>> tasks can be preempted by non-idle tasks at wakeup, but the scheduler
>>>> does not distinguish between idle and non-idle when picking the next
>>>> entity. This may cause background tasks to disturb the foreground.
>>>>
>>>> Test results as below:
>>>>
>>>> ~$ ./loop.sh &
>>>> [1] 764
>>>> ~$ chrt -i 0 ./loop.sh &
>>>> [2] 765
>>>> ~$ taskset -p 04 764
>>>> ~$ taskset -p 04 765
>>>>
>>>> ~$ top -p 764 -p 765
>>>> top - 13:10:01 up 1 min, 2 users, load average: 1.30, 0.38, 0.13
>>>> Tasks: 2 total, 2 running, 0 sleeping, 0 stopped, 0 zombie
>>>> %Cpu(s): 12.5 us, 0.0 sy, 0.0 ni, 87.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>>> KiB Mem : 16393492 total, 16142256 free, 111028 used, 140208 buff/cache
>>>> KiB Swap: 385836 total, 385836 free, 0 used. 16037992 avail Mem
>>>>
>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>> 764 chenyin+ 20 0 12888 1144 1004 R 100.0 0.0 1:05.12 loop.sh
>>>> 765 chenyin+ 20 0 12888 1224 1080 R 0.0 0.0 0:16.21 loop.sh
>>>>
>>>> The non-idle process (764) runs at 100% CPU without being disturbed
>>>> by the idle process (765).
>>>
>>> Did you just do a very complicated true idle time scheduler, with all
>>> the problems that brings?
>>
>> Colocating CPU-intensive jobs with latency-sensitive services can
>> improve CPU utilization, but it is difficult to meet the stringent
>> tail-latency requirements of the latency-sensitive services. We use a
>> true idle-time scheduler for the CPU-intensive jobs to minimize their
>> impact on the latency-sensitive services.
>
> Hard NAK on any true idle-time scheduler until you make the whole kernel
> immune to lock holder starvation issues.

If I set the sched_idle_vruntime_offset to a relatively small value
(e.g. 10 minutes), can this issue be avoided?
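
For reference, a minimal sketch of the enqueue-side idea (not the
actual patch): push an idle entity's vruntime to the right by the
offset at enqueue time, in the style of place_entity() in
kernel/sched/fair.c. se_is_idle() and max_vruntime() are existing
fair.c helpers; sched_idle_vruntime_offset is the tunable named above.

/*
 * Sketch only: shift an idle entity's vruntime right by a
 * configurable offset (in ns of virtual time) at enqueue, so that
 * non-idle entities always sort before it in the cfs_rq rbtree.
 */
static void place_idle_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
        u64 vruntime = cfs_rq->min_vruntime;

        if (se_is_idle(se))
                vruntime += sched_idle_vruntime_offset;

        /* Never move an entity's vruntime backwards. */
        se->vruntime = max_vruntime(se->vruntime, vruntime);
}

With a bounded offset, an idle entity is only deferred until the
non-idle entities have consumed that much vruntime, rather than being
starved indefinitely as under a true idle-time scheduler.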