linux-kernel - Re: [PATCH v5 2/3] sched/fair: Add cfs bandwidth burst statistics


Message-Id: <E4AA746E-6608-4576-BF19-57589B2867FE@linux.alibaba.com>
Date:   Fri, 21 May 2021 20:42:27 +0800
From:   changhuaixin <changhuaixin@...ux.alibaba.com>
To:     Odin Ugedal <odin@...d.al>
Cc:     changhuaixin <changhuaixin@...ux.alibaba.com>,
        Benjamin Segall <bsegall@...gle.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        dtcccc@...ux.alibaba.com, Juri Lelli <juri.lelli@...hat.com>,
        khlebnikov@...dex-team.ru,
        open list <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...e.de>, Ingo Molnar <mingo@...hat.com>,
        pauld@...head.com, Peter Zijlstra <peterz@...radead.org>,
        Paul Turner <pjt@...gle.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        shanpeic@...ux.alibaba.com, Tejun Heo <tj@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        xiyou.wangcong@...il.com
Subject: Re: [PATCH v5 2/3] sched/fair: Add cfs bandwidth burst statistics



> On May 20, 2021, at 10:11 PM, Odin Ugedal <odin@...d.al> wrote:
> 
> I am a bit sceptical about both the nr_burst and burst_time as they are now.
> 
> As an example; a control group using "99.9%" of the quota each period
> and that is never throttled. Such group would with this patch with a burst of X
> still get nr_throttled = 0 (as before), but it would get a nr_burst
> and burst_time that
> will keep increasing.
> 

Agreed, there are false positive and false negative cases, because the current implementation
judges by cfs_b->runtime rather than by the actual runtime used.
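
To make the gap concrete, here is a rough user-space toy model, not kernel code. All toy_*
names, the slice size and the numbers are made up for illustration. It only assumes that a
runqueue pulls runtime from the global pool in fixed slices and may not use all of it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the global pool (cfs_b->runtime), in microseconds. */
struct toy_pool {
	int64_t runtime;
};

/* Hypothetical stand-in for one cfs_rq: local runtime left and time really used. */
struct toy_rq {
	int64_t runtime_remaining;
	int64_t used;
};

/* Move up to one slice from the pool to the runqueue; return the amount moved. */
static int64_t toy_assign(struct toy_pool *p, struct toy_rq *rq, int64_t slice)
{
	int64_t amount = slice < p->runtime ? slice : p->runtime;

	p->runtime -= amount;
	rq->runtime_remaining += amount;
	return amount;
}

/*
 * Try to run for delta microseconds, pulling slices as needed.
 * Returns 1 on success, or 0 if the pool is empty first (the toy
 * equivalent of being throttled; nothing is consumed in that case).
 */
static int toy_run(struct toy_pool *p, struct toy_rq *rq, int64_t delta,
		   int64_t slice)
{
	assert(delta >= 0 && slice > 0);

	while (rq->runtime_remaining < delta) {
		if (toy_assign(p, rq, slice) == 0)
			return 0;
	}
	rq->runtime_remaining -= delta;
	rq->used += delta;
	return 1;
}
```

With a pool of 100000 us, slices of 5000 us, and two runqueues that run for 48000 us and
47000 us, the pool ends the period at 0 while only 95000 us were actually used. The other
5000 us sit unused in the runqueues, so anything judged from the pool alone sees the quota
as fully consumed.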

> I think there is a big difference between runtime moved/taken from
> cfs_b->runtime to cfs_rq->runtime_remaining and the actual runtime used
> in the period. Currently, cfs bw can only supply info the first one, and
> not the latter.
> 
> I think that if people see nr_burst increasing, that they think they _have_
> to use cfs burst in order to avoid being throttled, even though that might
> not be the case. It is probably fine as is, as long as it is explicitly stated

Users cannot see nr_burst increasing first and only then start using the cfs burst feature.
Do you mean that people see nr_throttled increasing and turn to cfs burst while the actual usage
is below quota? In that case, tasks get throttled because there is runtime yet to be returned from
the cfs_rq, and they get unthrottled shortly afterwards. That is a false positive for nr_throttled.
When users see that, using burst can help.

> what the values mean and imply, and what they do not. I cannot see another
> way to calculate it as it is now, but maybe someone else has some thoughts.
> 
> Thanks
> Odin