linux-kernel - Re: [RFC PATCH] sched/fair: update the vruntime to be max vruntime when yield

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y/zO8WZV2kvcU78b@hirez.programming.kicks-ass.net>
Date:   Mon, 27 Feb 2023 16:40:33 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Xuewen Yan <xuewen.yan@...soc.com>
Cc:     vincent.guittot@...aro.org, mingo@...hat.com,
        juri.lelli@...hat.com, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com, qyousef@...alina.io,
        linux-kernel@...r.kernel.org, ke.wang@...soc.com,
        zhaoyang.huang@...soc.com
Subject: Re: [RFC PATCH] sched/fair: update the vruntime to be max vruntime
 when yield

On Wed, Feb 22, 2023 at 04:03:14PM +0800, Xuewen Yan wrote:
> When task call the sched_yield, cfs would set the cfs's skip buddy.
> If there is no other task call the sched_yield syscall, the task would
> always be skiped when there are tasks in rq. 

So you have two tasks A) which does sched_yield() and becomes ->skip,
and B) which is while(1). And you're saying that once A does it's thing,
B runs forever and starves A?

> As a result, the task's
> vruntime would not be updated for long time, and the cfs's min_vruntime
> is almost not updated.

But the condition in pick_next_entity() should ensure that we still pick
->skip when it becomes too old. Specifically, when it gets more than
wakeup_gran() behind.

> When this scenario happens, when the yield task had wait for a long time,
> and other tasks run a long time, once there is other task call the sched_yield,
> the cfs's skip_buddy is covered, at this time, the first task can run normally,
> but the task's vruntime is small, as a result, the task would always run,
> because other task's vruntime is big. This would lead to other tasks can not
> run for a long time.

I'm not seeing how this could happen, it should never get behind that
far.

Additionally, check_preempt_tick() will explicitly clear the buddies
when it finds the current task has consumed it's ideal slice.

I really cannot see how your scenario can happen.

> In order to mitigate this, update the yield_task's vruntime to be cfs's max vruntime.
> This way, the cfs's min vruntime can be updated as the process running.

This is a bad solution, SCHED_IDLE tasks have very low weight and can be
shot really far to the right, leading to other trouble.