linux-kernel - Re: Re: [PATCH 1/4] sched/eevdf: Fix vruntime adjustment on reweight

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <38a69b11-d6bb-4f0b-8080-7a051ad58206@bytedance.com>
Date: Fri, 1 Mar 2024 16:30:11 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: Tianchen Ding <dtcccc@...ux.alibaba.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Valentin Schneider <valentin.schneider@....com>,
 Barry Song <21cnbao@...il.com>, Benjamin Segall <bsegall@...gle.com>,
 Chen Yu <yu.c.chen@...el.com>, Daniel Jordan <daniel.m.jordan@...cle.com>,
 "Gautham R . Shenoy" <gautham.shenoy@....com>,
 Joel Fernandes <joel@...lfernandes.org>,
 K Prateek Nayak <kprateek.nayak@....com>, Mike Galbraith <efault@....de>,
 Qais Yousef <qyousef@...alina.io>, Tim Chen <tim.c.chen@...ux.intel.com>,
 Yicong Yang <yangyicong@...wei.com>,
 Youssef Esmat <youssefesmat@...omium.org>, linux-kernel@...r.kernel.org
Subject: Re: Re: [PATCH 1/4] sched/eevdf: Fix vruntime adjustment on reweight

On 3/1/24 2:41 PM, Tianchen Ding Wrote:
> On 2024/2/29 22:25, Abel Wu wrote:
>> Good catch. And to the best of my knowledge, the answer is YES. The
>> above Equation in the paper, which is Eq. (20), is based on the
>> assumption that:
>>
>>      "once client 3 leaves, the remaining two clients will
>>       proportionally support the eventual loss or gain in the
>>       service time"  -- Page 10
>>
>>      "by updating the virtual time according to Eq. (18,19) we
>>       ensure that the sum over the lags of all active clients
>>       is always zero"  -- Page 11
>>
>> But in Peter's implementation, it is the competitors in the new group
>> that client 3 later joins in who actually support the effect. So when
>> client 3 leaves competition with !0-lag in Linux, the rq's sum(lag_i)
>> is no longer zero.
>>
> 
> I've different opinions. According to the comments above avg_vruntime_add(), V
> is calculated exactly to satisfy sum(lag_i)=0. This is guaranteed by math.

Yes, you are right. I mixed another fairness issue with this. What I
was thinking is that considering multiple competition groups (e.g.
runqueues), the latency bound could be violated, that is someone could
starve a bit. Say one entity even with positive lag could become less
competitive if migrated to a higher competitive group.

Staring at Eq. (20) again, what if we do a fake reweight? I mean let
the client leave and rejoin at the same time without changing weight?
IMHO it should have no effects, but according to Eq. (20) the V will
change to:

	V' = V + lag(j)/(W - w_j) - lag(j)/W != V

Have I missed anything?

> 
> Actually I print some logs in enqueue_entity() and dequeue_entity() to verify this:
> 
> [  293.261236] before dequeue: V=2525278131 W=3072 v=2526243139 w=1024 lag_sum=0
> [  293.261237] after dequeue: V=2524795627 W=2048 v=2526243139 w=1024 lag_sum=0
> [  293.262286] before enqueue: V=2525319064 W=2048 v=2526766576 w=1024 lag_sum=0
> [  293.262287] after enqueue: V=2525801568 W=3072 v=2526766576 w=1024 lag_sum=0
> 
> For the first 2 lines, we have 2524795627 = 2525278131 + (2525278131 - 2526243139) * 1024 / 2048.
> Which is Eq. (18)
> 
> For the last 2 lines, we have 2525801568 = 2525319064 - (2525319064 - 2526766576) * 1024 / 3072.
> Which is Eq. (19)
> 
> So whatever client 3 leave or join competition with !0-lag in Linux, V is handled properly.
> 
>> Good catch again! It smells like a bug. Since this @se is still on_rq,
>> it should be taken into consideration when calculating avg_runtime(),
>> but in fact it isn't because __dequeue_entity() will remove its share.
>>
>> And I seem to spot another bug, although not relate to this problem,
>> that we actually need to call update_curr() unconditionally if curr is
>> available, because we need to commit curr's outstanding runtime to
>> ensure the result of avg_runtime() is up to date.
>>
> 
> I've tried to record avg_vruntime before __dequeue_entity() and pass it to
> reweight_eevdf(). Then the issue is fixed. The V keeps the same during the whole
> reweight_entity().
> 
> I could send these two bugfix patches (one for this bug and one you sugguested

That would be appreciated!

Thanks,
	Abel