lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <64d963b6-2d9c-3f93-d427-a1ff705fb65a@bytedance.com>
Date:   Fri, 28 Oct 2022 11:57:43 +0800
From:   Chuyi Zhou <zhouchuyi@...edance.com>
To:     Josh Don <joshdon@...gle.com>
Cc:     peterz@...radead.org, juri.lelli@...hat.com, mingo@...hat.com,
        vincent.guittot@...aro.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: favor non-idle group in tick preemption



在 2022/10/28 07:34, Josh Don 写道:
> Hi Chuyi,
> 
> On Thu, Oct 27, 2022 at 1:16 AM Chuyi Zhou <zhouchuyi@...edance.com> wrote:
>>
>> The non-idle se dominates competition vs the idle se when they
>> are belong to the same group. We ensure that idle groups would not
>> preempt non-idle group in wakeup preemption(see check_preempt_wakeup()).
>> However, this can happen in tick preemption, since check_preempt_tick()
>> dose not check current/se is idle or not. This patch adds this check.
>>
>> Signed-off-by: Chuyi Zhou <zhouchuyi@...edance.com>
>> ---
>>   kernel/sched/fair.c | 12 +++++++++++-
>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index e4a0b8bd941c..f3324b8753b3 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4750,6 +4750,7 @@ static void
>>   check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>>   {
>>          unsigned long ideal_runtime, delta_exec;
>> +       int cse_is_idle, pse_is_idle;
>>          struct sched_entity *se;
>>          s64 delta;
>>
>> @@ -4779,8 +4780,17 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>>          if (delta < 0)
>>                  return;
>>
>> -       if (delta > ideal_runtime)
>> +       if (delta > ideal_runtime) {
>> +               /*
>> +                * Favor non-idle group even in tick preemption
>> +                */
>> +               cse_is_idle = se_is_idle(curr);
>> +               pse_is_idle = se_is_idle(se);
>> +               if (unlikely(!cse_is_idle && pse_is_idle))
>> +                       return;
> 
Hi Josh, thanks for your reply,
> This would make it so that we never have tick based preemption of a
> non-idle entity by an idle entity. That's a recipe for starvation of
> the idle entity, if the non-idle entity is cpu bound.
> 
> Beyond that though, I'm not quite sure the issue being solved here.
> The large difference in weight between the idle and non-idle entity
> means that the non-idle entity will not be preempted for quite a while
> due to its ideal_runtime being quite high. The idle entity will
> quickly be preempted on the next tick it takes due to the smaller
> value of sysctl_sched_idle_min_granularity.
> 
Actually, I did some tests and traced this issue. the result shows that 
this can happen with small probability. I also do some benchmarks. The 
result seems it has no performance harm, and we can reduce 2%~3% 
context switch when idle group & non-idle group are present at the same 
time. In addition, for throughput concern, I think we better let 
non-idle entity consume it's ideal_runtime. However, as you said, it may 
cause starvation of the idle entity.....

There is another question I would like to take this opportunity to 
consult you. In our production environment, in some cases, we want to 
adjust the weight/shares of the idle-cgroup which means we need to 
change the logic of sched_group_set_shares(), and it can increase the 
probability of the above issue. So, for what reasons did you prohibit 
users from changing weights of idle cgroup.

Thanks again for your review.

Best,
Chuyi
> The wakeup check is useful for latency sensitive non-idle tasks.
> However, in steady state competition between idle and non-idle, we
> must allow some amount of round-robin.
> 
>> +
>>                  resched_curr(rq_of(cfs_rq));
>> +       }
>>   }
>>
>>   static void
>> --
>> 2.20.1
>>
> 
> Best,
> Josh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ