Message-ID: <YNnkDnJtliEInwTY@hirez.programming.kicks-ass.net>
Date: Mon, 28 Jun 2021 17:00:30 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Huaixin Chang <changhuaixin@...ux.alibaba.com>
Cc: luca.abeni@...tannapisa.it, anderson@...unc.edu, baruah@...tl.edu,
bsegall@...gle.com, dietmar.eggemann@....com,
dtcccc@...ux.alibaba.com, juri.lelli@...hat.com,
khlebnikov@...dex-team.ru, linux-kernel@...r.kernel.org,
mgorman@...e.de, mingo@...hat.com, odin@...d.al, odin@...dal.com,
pauld@...head.com, pjt@...gle.com, rostedt@...dmis.org,
shanpeic@...ux.alibaba.com, tj@...nel.org,
tommaso.cucinotta@...tannapisa.it, vincent.guittot@...aro.org,
xiyou.wangcong@...il.com
Subject: Re: [PATCH v6 2/3] sched/fair: Add cfs bandwidth burst statistics
On Mon, Jun 21, 2021 at 05:27:59PM +0800, Huaixin Chang wrote:
> The following statistics are added to the cpu.stat file to show how much the
> workload is making use of cfs_b burst:
>
> nr_bursts:  number of periods in which a bandwidth burst occurs
> burst_usec: cumulative wall-time that any CPUs have used
>             above quota in the respective periods
>
> The larger nr_bursts is, the more bursty periods there are. The larger
> burst_usec is, the more burst time has been used by the bursty workload.
That's what it does, but it fails to explain why. How are these numbers
useful?
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 53d7cc4d009b..62b73722e510 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4634,11 +4634,22 @@ static inline u64 sched_cfs_bandwidth_slice(void)
> */
> void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b)
> {
> + u64 runtime;
> +
> if (unlikely(cfs_b->quota == RUNTIME_INF))
> return;
>
> + if (cfs_b->runtime_at_period_start > cfs_b->runtime) {
> + runtime = cfs_b->runtime_at_period_start - cfs_b->runtime;
That comparison is the same as the subtraction; might as well write it as
such.
> + if (runtime > cfs_b->quota) {
> + cfs_b->burst_time += runtime - cfs_b->quota;
Same here.
> + cfs_b->nr_burst++;
> + }
> + }
Perhaps we can write that like:
	s64 runtime = cfs_b->runtime_snapshot - cfs_b->runtime;
	if (runtime > 0) {
		s64 burstime = runtime - cfs_b->quota;
		if (burstime > 0) {
			cfs_b->burst_time += burstime;
			cfs_b->nr_burst++;
		}
	}
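Spelled out in full, __refill_cfs_bandwidth_runtime() would then read roughly
as below. This is only a sketch assembling the pieces quoted above, with
runtime_snapshot standing in for runtime_at_period_start as per the naming
comment further down:

void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b)
{
	s64 runtime, burstime;

	if (unlikely(cfs_b->quota == RUNTIME_INF))
		return;

	/* runtime consumed since the last refill */
	runtime = cfs_b->runtime_snapshot - cfs_b->runtime;
	if (runtime > 0) {
		/* anything consumed above quota came out of the burst */
		burstime = runtime - cfs_b->quota;
		if (burstime > 0) {
			cfs_b->burst_time += burstime;
			cfs_b->nr_burst++;
		}
	}

	cfs_b->runtime += cfs_b->quota;
	cfs_b->runtime = min(cfs_b->runtime, cfs_b->quota + cfs_b->burst);
	cfs_b->runtime_snapshot = cfs_b->runtime;
}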
I was hoping we could get away with something simpler, like maybe:
	u64 old_runtime = cfs_b->runtime;

	cfs_b->runtime += cfs_b->quota;
	cfs_b->runtime = min(cfs_b->runtime, cfs_b->quota + cfs_b->burst);
	if (cfs_b->runtime - old_runtime > cfs_b->quota)
		cfs_b->nr_burst++;
Would that be good enough?
> +
> cfs_b->runtime += cfs_b->quota;
> cfs_b->runtime = min(cfs_b->runtime, cfs_b->quota + cfs_b->burst);
> + cfs_b->runtime_at_period_start = cfs_b->runtime;
> }
>
> static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index d317ca74a48c..b770b553dfbb 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -367,6 +367,7 @@ struct cfs_bandwidth {
> u64 quota;
> u64 runtime;
> u64 burst;
> + u64 runtime_at_period_start;
> s64 hierarchical_quota;
As per the above, I don't really like that name; runtime_snapshot or
perhaps runtime_snap is shorter and no less clear. But not having it at
all would be even better.