Message-ID: <e57c7166-b484-0d32-e4e8-5a47ef0bb53c@bytedance.com>
Date: Sun, 13 Mar 2022 18:06:58 +0800
From: chenying <chenying.kernel@...edance.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, mgorman@...e.de, bristot@...hat.com,
bsegall@...gle.com, linux-kernel@...r.kernel.org,
duanxiongchun@...edance.com, zhouchengming@...edance.com,
songmuchun@...edance.com, zhengqi.arch@...edance.com,
zhoufeng.zf@...edance.com, ligang.bdlg@...edance.com
Subject: Re: [External] Re: [PATCH] sched/fair: prioritize normal
task over sched_idle task with vruntime offset

On 2022/3/13 17:02, Peter Zijlstra wrote:
> On Sun, Mar 13, 2022 at 01:37:37PM +0800, chenying wrote:
>> On 2022/3/12 20:03, Peter Zijlstra wrote:
>>> On Fri, Mar 11, 2022 at 03:58:47PM +0800, chenying wrote:
>>>> We add a time offset to se->vruntime when an idle sched_entity is
>>>> enqueued, so that the idle entity is always placed to the right of
>>>> the non-idle entities in the runqueue. This allows non-idle tasks
>>>> to be selected and run before the idle ones.
>>>>
>>>> A use-case is running background tasks as sched_idle and foreground
>>>> tasks as non-idle. The foreground tasks are latency sensitive and
>>>> must not be disturbed by the background. It is well known that idle
>>>> tasks can be preempted by non-idle tasks at wakeup, but the scheduler
>>>> does not distinguish between idle and non-idle when picking the next
>>>> entity. This may cause background tasks to disturb the foreground.
>>>>
>>>> Test results as below:
>>>>
>>>> ~$ ./loop.sh &
>>>> [1] 764
>>>> ~$ chrt -i 0 ./loop.sh &
>>>> [2] 765
>>>> ~$ taskset -p 04 764
>>>> ~$ taskset -p 04 765
>>>>
>>>> ~$ top -p 764 -p 765
>>>> top - 13:10:01 up 1 min, 2 users, load average: 1.30, 0.38, 0.13
>>>> Tasks: 2 total, 2 running, 0 sleeping, 0 stopped, 0 zombie
>>>> %Cpu(s): 12.5 us, 0.0 sy, 0.0 ni, 87.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>>> KiB Mem : 16393492 total, 16142256 free, 111028 used, 140208 buff/cache
>>>> KiB Swap: 385836 total, 385836 free, 0 used. 16037992 avail Mem
>>>>
>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>>> 764 chenyin+ 20 0 12888 1144 1004 R 100.0 0.0 1:05.12 loop.sh
>>>> 765 chenyin+ 20 0 12888 1224 1080 R 0.0 0.0 0:16.21 loop.sh
>>>>
>>>> The non-idle process (764) runs at 100% CPU without being disturbed
>>>> by the idle process (765).
>>>
>>> Did you just do a very complicated true idle time scheduler, with all
>>> the problems that brings?
>>
>> Colocating CPU-intensive jobs with latency-sensitive services can
>> improve CPU utilization, but it is difficult to meet the stringent
>> tail-latency requirements of the latency-sensitive services. We use a
>> true idle-time scheduler for the CPU-intensive jobs to minimize their
>> impact on the latency-sensitive services.
>
> Hard NAK on any true idle-time scheduler until you make the whole kernel
> immune to lock holder starvation issues.

If I set the sched_idle_vruntime_offset to a relatively small value
(e.g. 10 minutes), can this issue be avoided?
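
For reference, a minimal sketch of the enqueue-side idea (not the
actual patch): push an idle entity's vruntime to the right by the
offset at enqueue time, in the style of place_entity() in
kernel/sched/fair.c. se_is_idle() and max_vruntime() are existing
fair.c helpers; sched_idle_vruntime_offset is the tunable named above.

/*
 * Sketch only: shift an idle entity's vruntime right by a
 * configurable offset (in ns of virtual time) at enqueue, so that
 * non-idle entities always sort before it in the cfs_rq rbtree.
 */
static void place_idle_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
        u64 vruntime = cfs_rq->min_vruntime;

        if (se_is_idle(se))
                vruntime += sched_idle_vruntime_offset;

        /* Never move an entity's vruntime backwards. */
        se->vruntime = max_vruntime(se->vruntime, vruntime);
}

With a bounded offset, an idle entity is only deferred until the
non-idle entities have consumed that much vruntime, rather than being
starved indefinitely as under a true idle-time scheduler.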