Date:   Fri, 21 May 2021 16:05:19 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Odin Ugedal <odin@...d.al>
Cc:     Huaixin Chang <changhuaixin@...ux.alibaba.com>,
        Benjamin Segall <bsegall@...gle.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        dtcccc@...ux.alibaba.com, Juri Lelli <juri.lelli@...hat.com>,
        khlebnikov@...dex-team.ru,
        open list <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...e.de>, Ingo Molnar <mingo@...hat.com>,
        pauld@...head.com, Paul Turner <pjt@...gle.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        shanpeic@...ux.alibaba.com, Tejun Heo <tj@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        xiyou.wangcong@...il.com
Subject: Re: [PATCH v5 2/3] sched/fair: Add cfs bandwidth burst statistics

On Thu, May 20, 2021 at 04:11:52PM +0200, Odin Ugedal wrote:
> I am a bit sceptical about both the nr_burst and burst_time as they are now.
> 
> As an example, take a control group that uses "99.9%" of its quota each
> period and is never throttled. With this patch and a burst of X, such a
> group would still get nr_throttled = 0 (as before), but its nr_burst and
> burst_time would keep increasing.
> 
> I think there is a big difference between runtime moved/taken from
> cfs_b->runtime to cfs_rq->runtime_remaining and the actual runtime used
> in the period. Currently, cfs bw can only supply info on the former, and
> not the latter.
> 
> I think that if people see nr_burst increasing, they will think they _have_
> to use cfs burst in order to avoid being throttled, even though that might
> not be the case. It is probably fine as is, as long as it is explicitly stated
> what the values mean and imply, and what they do not. I cannot see another
> way to calculate it as it is now, but maybe someone else has some thoughts.

You can always trace the system. I don't think we have nice tracepoints
for any of this, but much can be inferred from the scheduler and hrtimer
tracepoints. Also, kprobes might be employed to stick in more appropriate
thingies, I suppose.

You can also run the workload without bandwidth controls and measure
its job execution times, and from that compute the bandwidth settings,
all without tracepoints.
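
As a rough illustration of that last approach, here is a minimal sketch of
turning measurements into settings. It assumes you already have the CPU time
(in microseconds) the workload consumed in each period while running
unrestricted. The helper name suggest_cfs_bandwidth, the choice of a quantile
for the quota, and the rule of sizing the burst to cover the gap up to the
worst observed period are all illustrative assumptions, not anything the
kernel or this patch set prescribes.

```python
import math


def suggest_cfs_bandwidth(cpu_usec_per_period, period_usec=100_000, quantile=0.9):
    """Hypothetical helper: derive quota/burst from per-period CPU usage.

    The samples are assumed to come from a run without bandwidth limits.
    """
    if not cpu_usec_per_period:
        raise ValueError("need at least one sample")
    if not 0 < quantile <= 1:
        raise ValueError("quantile must be in (0, 1]")

    samples = sorted(cpu_usec_per_period)

    # Nearest-rank quantile: the smallest sample that covers the requested
    # fraction of the observed periods.
    rank = max(1, math.ceil(quantile * len(samples)))
    quota = samples[rank - 1]

    # Assumption: size the burst to cover the gap between the quota and the
    # worst period seen, and conservatively never let it exceed the quota.
    burst = min(samples[-1] - quota, quota)

    return {"period_usec": period_usec, "quota_usec": quota, "burst_usec": burst}


if __name__ == "__main__":
    measured = [40_000, 50_000, 45_000, 90_000, 42_000,
                48_000, 41_000, 43_000, 47_000, 44_000]
    print(suggest_cfs_bandwidth(measured))
```

A higher quantile trades a larger steady-state quota for a smaller burst;
which balance is right depends on how much throttling the jobs can tolerate.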
