[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <06613204-b279-4f66-a786-e5e26bccd42e@bytedance.com>
Date: Mon, 20 Nov 2023 20:06:08 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Tobias Huschle <huschle@...ux.ibm.com>,
Linux Kernel <linux-kernel@...r.kernel.org>, kvm@...r.kernel.org,
virtualization@...ts.linux.dev, netdev@...r.kernel.org, mst@...hat.com,
jasowang@...hat.com
Subject: Re: Re: Re: EEVDF/vhost regression (bisected to 86bfbb7ce4f6
sched/fair: Add lag based placement)
On 11/20/23 6:56 PM, Peter Zijlstra Wrote:
> On Sat, Nov 18, 2023 at 01:14:32PM +0800, Abel Wu wrote:
>
>> Hi Peter, I'm a little confused here. As we adopt placement strategy #1
>> when PLACE_LAG is enabled, the lag of that entity needs to be preserved.
>> Given that the weight doesn't change, we have:
>>
>> vl' = vl
>>
>> But in fact it is scaled on placement:
>>
>> vl' = vl * W/(W + w)
>
> (W+w)/W
Ah, right. I misunderstood (again) the comment which says:
vl_i = (W + w_i)*vl'_i / W
So the current implementation is:
v' = V - vl'
and what I was proposing is:
v' = V' - vl
and they are equal in fact.
>
>>
>> Does this intended?
>
> The scaling, yes that's intended and the comment explains why. So now
> you have me confused too :-)
>
> Specifically, I want the lag after placement to be equal to the lag we
> come in with. Since placement will affect avg_vruntime (adding one
> element to the average changes the average etc..) the placement also
> affects the lag as measured after placement.
Yes. You did the math in an iterative fashion and mine is facing the
final state:
v' = V' - vlag
V' = (WV + wv') / (W + w)
which gives:
V' = V - w * vlag / W
>
> Or rather, if you enqueue and dequeue, I want the lag to be preserved.
> If you do not take placement into consideration, lag will dissipate real
> quick.
>
>> And to illustrate my understanding of strategy #1:
>
>> @@ -5162,41 +5165,17 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>> * vl_i is given by:
>> *
>> * V' = (\Sum w_j*v_j + w_i*v_i) / (W + w_i)
>> - * = (W*V + w_i*(V - vl_i)) / (W + w_i)
>> - * = (W*V + w_i*V - w_i*vl_i) / (W + w_i)
>> - * = (V*(W + w_i) - w_i*l) / (W + w_i)
>> - * = V - w_i*vl_i / (W + w_i)
>> - *
>> - * And the actual lag after adding an entity with vl_i is:
>> - *
>> - * vl'_i = V' - v_i
>> - * = V - w_i*vl_i / (W + w_i) - (V - vl_i)
>> - * = vl_i - w_i*vl_i / (W + w_i)
>> - *
>> - * Which is strictly less than vl_i. So in order to preserve lag
>> - * we should inflate the lag before placement such that the
>> - * effective lag after placement comes out right.
>> - *
>> - * As such, invert the above relation for vl'_i to get the vl_i
>> - * we need to use such that the lag after placement is the lag
>> - * we computed before dequeue.
>> + * = (W*V + w_i*(V' - vl_i)) / (W + w_i)
>> + * = V - w_i*vl_i / W
>> *
>> - * vl'_i = vl_i - w_i*vl_i / (W + w_i)
>> - * = ((W + w_i)*vl_i - w_i*vl_i) / (W + w_i)
>> - *
>> - * (W + w_i)*vl'_i = (W + w_i)*vl_i - w_i*vl_i
>> - * = W*vl_i
>> - *
>> - * vl_i = (W + w_i)*vl'_i / W
>> */
>> load = cfs_rq->avg_load;
>> if (curr && curr->on_rq)
>> load += scale_load_down(curr->load.weight);
>> -
>> - lag *= load + scale_load_down(se->load.weight);
>> if (WARN_ON_ONCE(!load))
>> load = 1;
>> - lag = div_s64(lag, load);
>> +
>> + vruntime -= div_s64(lag * scale_load_down(se->load.weight), load);
>> }
>> se->vruntime = vruntime - lag;
>
>
> So you're proposing we do:
>
> v = V - (lag * w) / (W + w) - lag
What I 'm proposing is:
V' = V - w * vlag / W
so we have:
v' = V' - vlag
= V - vlag * w/W - vlag
= V - vlag * (W + w)/W
which is exactly the same as current implementation.
>
> ?
>
> That can be written like:
>
> v = V - (lag * w) / (W+w) - (lag * (W+w)) / (W+w)
> = V - (lag * (W+w) + lag * w) / (W+w)
> = V - (lag * (W+2w)) / (W+w)
>
> And that turns into a mess AFAICT.
>
>
> Let me repeat my earlier argument. Suppose v,w,l are the new element.
> V,W are the old avg_vruntime and sum-weight.
>
> Then: V = V*W / W, and by extention: V' = (V*W + v*w) / (W + w).
>
> The new lag, after placement:
>
> l' = V' - v = (V*W + v*w) / (W+w) - v
> = (V*W + v*w) / (W+w) - v * (W+w) / (W+v)
> = (V*W + v*w -v*W - v*w) / (W+w)
> = (V*W - v*W) / (W+w)
> = W*(V-v) / (W+w)
> = W/(W+w) * (V-v)
>
> Substitute: v = V - (W+w)/W * l, my scaling thing, to obtain:
>
> l' = W/(W+w) * (V - (V - (W+w)/W * l))
> = W/(W+w) * (V - V + (W+w)/W * l)
> = W/(W+w) * (W+w)/W * l
> = l
>
> So by scaling, we've preserved lag across placement.
>
> That make sense?
Yes, I think I won't misunderstand again for the 3rd time :)
Thanks!
Abel
Powered by blists - more mailing lists