linux-kernel - Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS bandwidth burst


Message-ID: <YFNsKGKRL3SaJNZk@hirez.programming.kicks-ass.net>
Date:   Thu, 18 Mar 2021 16:05:12 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     changhuaixin <changhuaixin@...ux.alibaba.com>
Cc:     Benjamin Segall <bsegall@...gle.com>, dietmar.eggemann@....com,
        juri.lelli@...hat.com, khlebnikov@...dex-team.ru,
        open list <linux-kernel@...r.kernel.org>, mgorman@...e.de,
        mingo@...hat.com, Odin Ugedal <odin@...d.al>,
        Odin Ugedal <odin@...dal.com>, pauld@...head.com,
        Paul Turner <pjt@...gle.com>, rostedt@...dmis.org,
        Shanpei Chen <shanpeic@...ux.alibaba.com>,
        Tejun Heo <tj@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        xiyou.wangcong@...il.com
Subject: Re: [PATCH v4 1/4] sched/fair: Introduce primitives for CFS
 bandwidth burst

On Thu, Mar 18, 2021 at 09:26:58AM +0800, changhuaixin wrote:
> > On Mar 17, 2021, at 4:06 PM, Peter Zijlstra <peterz@...radead.org> wrote:

> > So what is the typical avg,stdev,max and mode for the workloads where you find
> > you need this?
> > 
> > I would really like to put a limit on the burst. IMO a workload that has
> > a burst many times longer than the quota is plain broken.
> 
> I see. Then the problem comes down to how large the limit on burst shall be.
> 
> I have sampled the CPU usage of a bursty container in 100ms periods. The statistics are:

So CPU usage isn't exactly what is required, job execution time is what
you're after. Assuming there is a relation...

> average	: 42.2%
> stddev	: 81.5%
> max		: 844.5%
> P95		: 183.3%
> P99		: 437.0%

Then your WCET is 844% of 100ms, which is .84s?
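
As a quick check of that arithmetic, a minimal sketch. The variable names
are illustrative only; the 100ms period and the 844.5% max are the figures
quoted above, and the max sample is read as the CPU time one job needs
(the "assuming there is a relation" caveat applies):

```python
PERIOD_MS = 100.0        # sampling period from the quoted statistics
MAX_USAGE_PCT = 844.5    # max CPU usage observed within one period

# CPU time consumed in the worst sampled period, converted to seconds.
wcet_s = MAX_USAGE_PCT / 100.0 * PERIOD_MS / 1000.0
print(f"WCET estimate: {wcet_s:.4f}s")
```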

But you forgot your mode: what is the most common duration? Given P95 is
so high, I doubt that avg is representative of it.

> If quota is 100000ms, burst buffer needs to be 8 times more in order
> for this workload not to be throttled.

Where does that 100s come from? And an 800s burst is bizarre.

Did you typo [us] as [ms]?
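
For reference, the CFS bandwidth knobs (cpu.cfs_quota_us, cpu.cfs_period_us)
are expressed in microseconds. A small sketch contrasting the two readings
of the quoted number; the helper name to_seconds is hypothetical, and the
factor of 8 is taken from the quoted text:

```python
QUOTA = 100000       # the number quoted above; its unit is the open question
BURST_FACTOR = 8     # "8 times more", from the quoted text

US_PER_UNIT = {"us": 1, "ms": 1000}

def to_seconds(value, unit):
    """Convert a value given in 'us' or 'ms' to seconds."""
    return value * US_PER_UNIT[unit] / 1e6

for unit in ("ms", "us"):
    quota_s = to_seconds(QUOTA, unit)
    burst_s = BURST_FACTOR * quota_s
    print(f"[{unit}] quota = {quota_s:g}s, burst = {burst_s:g}s")
```

Read as [us], the quota is one 100ms period and the burst is 0.8s, which
lines up with the 844.5% max sample.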

> I can't say this is typical, but these workloads exist. On a machine
> running Kubernetes containers, where there is often room for such
> burst and the interference is hard to notice, users would prefer
> allowing such burst to being throttled occasionally.

Users also want ponies. I've no idea what kubernetes actually is or what
it has to do with containers. That's all just word salad.

> In this sense, I suggest limiting the burst buffer to around 16 times
> the quota. That should be enough for users to improve tail latency
> caused by throttling. And users might choose a smaller one or even
> none, if the interference is unacceptable. What do you think?

Well, normal RT theory would suggest you pick your runtime around 200%
to get that P95 and then allow a full period burst to get your P99, but
that same RT theory would also have you calculate the resulting
interference and see if that works with the rest of the system...
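
A rough sketch of that sizing, using only the quoted percentiles. It
assumes (this is not from the patch) that a burst lands on a full bucket,
so the extra runtime needed in a period is simply demand minus quota. The
function name burst_needed is hypothetical, and the interference
calculation is not modelled at all:

```python
QUOTA_PCT = 200.0    # runtime per 100ms period, "around 200%" as suggested
STATS_PCT = {"P95": 183.3, "P99": 437.0, "max": 844.5}   # quoted figures

def burst_needed(demand_pct, quota_pct):
    """Extra runtime (in % of one period) required beyond the quota."""
    return max(0.0, demand_pct - quota_pct)

for name, demand in STATS_PCT.items():
    extra = burst_needed(demand, QUOTA_PCT)
    print(f"{name}: burst {extra:.1f}% = {extra / QUOTA_PCT:.2f}x quota")
```

Under those assumptions P95 needs no burst at all, P99 needs a bit more
than one quota's worth, and even the max sample needs only a few times the
quota.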

16 times is horrific.