Message-Id: <E4AA746E-6608-4576-BF19-57589B2867FE@linux.alibaba.com>
Date: Fri, 21 May 2021 20:42:27 +0800
From: changhuaixin <changhuaixin@...ux.alibaba.com>
To: Odin Ugedal <odin@...d.al>
Cc: changhuaixin <changhuaixin@...ux.alibaba.com>,
Benjamin Segall <bsegall@...gle.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
dtcccc@...ux.alibaba.com, Juri Lelli <juri.lelli@...hat.com>,
khlebnikov@...dex-team.ru,
open list <linux-kernel@...r.kernel.org>,
Mel Gorman <mgorman@...e.de>, Ingo Molnar <mingo@...hat.com>,
pauld@...head.com, Peter Zijlstra <peterz@...radead.org>,
Paul Turner <pjt@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>,
shanpeic@...ux.alibaba.com, Tejun Heo <tj@...nel.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
xiyou.wangcong@...il.com
Subject: Re: [PATCH v5 2/3] sched/fair: Add cfs bandwidth burst statistics
> On May 20, 2021, at 10:11 PM, Odin Ugedal <odin@...d.al> wrote:
>
> I am a bit sceptical about both the nr_burst and burst_time as they are now.
>
> As an example; a control group using "99.9%" of the quota each period
> and that is never throttled. Such group would with this patch with a burst of X
> still get nr_throttled = 0 (as before), but it would get a nr_burst
> and burst_time that
> will keep increasing.
>
Agreed, there are false positive and false negative cases, as the current implementation
uses cfs_b->runtime to judge instead of the actual runtime used.
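
To make the false positive concrete, here is a rough toy model in Python. It is not
kernel code. The slice size, the per-CPU usage numbers and the function name are all
assumptions chosen for illustration, and the model deliberately ignores unused runtime
being returned from cfs_rq. It only shows how runtime taken from the global pool can
exceed quota while the runtime actually used stays below it.

```python
# Toy model only, not kernel code. All names and numbers are assumptions.
SLICE_US = 5_000  # assumed size of one transfer from the global pool to a CPU


def simulate_period(per_cpu_usage_us, slice_us=SLICE_US):
    """Return (taken_from_pool_us, actually_used_us) for one period.

    Each CPU pulls whole slices from the global pool until its local
    runtime covers what it runs. Leftover local runtime is NOT returned
    in this model, which is a simplification.
    """
    taken = 0
    used = 0
    for usage in per_cpu_usage_us:
        slices = -(-usage // slice_us)  # ceiling division
        taken += slices * slice_us
        used += usage
    return taken, used


quota_us = 100_000
per_cpu = [12_487] * 8  # 99_896 us in total, roughly 99.9% of quota

taken, used = simulate_period(per_cpu)

# Judging by what left the global pool reports a burst ...
pool_based_burst = taken > quota_us
# ... while judging by real usage does not.
usage_based_burst = used > quota_us

print(taken, used, pool_based_burst, usage_based_burst)
# prints: 120000 99896 True False
```

In this sketch the pool-based check flags a burst in every period although the group
never runs for more than its quota, which is the kind of case described above.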
> I think there is a big difference between runtime moved/taken from
> cfs_b->runtime to cfs_rq->runtime_remaining and the actual runtime used
> in the period. Currently, cfs bw can only supply info the first one, and
> not the latter.
>
> I think that if people see nr_burst increasing, that they think they _have_
> to use cfs burst in order to avoid being throttled, even though that might
> not be the case. It is probably fine as is, as long as it is explicitly stated
It can't be that people see nr_burst increasing first and start using the cfs burst feature afterwards.
Do you mean people see nr_throttled increasing and use cfs burst, while the actual usage
is below quota? In that case, tasks get throttled because there is runtime still to be returned from
cfs_rq, and they get unthrottled shortly. That is a false positive for nr_throttled. When users see that,
using burst can help.
> what the values mean and imply, and what they do not. I cannot see another
> way to calculate it as it is now, but maybe someone else has some thoughts.
>
> Thanks
> Odin