Message-ID: <Z6onPMIxS0ixXxj9@slm.duckdns.org>
Date: Mon, 10 Feb 2025 06:20:12 -1000
From: Tejun Heo <tj@...nel.org>
To: Michal Koutný <mkoutny@...e.com>
Cc: Abel Wu <wuyun.abel@...edance.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Jonathan Corbet <corbet@....net>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Yury Norov <yury.norov@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Bitao Hu <yaoma@...ux.alibaba.com>,
	Chen Ridong <chenridong@...wei.com>,
	"open list:CONTROL GROUP (CGROUP)" <cgroups@...r.kernel.org>,
	"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
	open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 3/3] cgroup/rstat: Add run_delay accounting for cgroups

On Mon, Feb 10, 2025 at 04:38:56PM +0100, Michal Koutný wrote:
...
> The challenge is with nr (assuming they're all runnable during Δt), that
> would need to be sampled from /sys/kernel/debug/sched/debug. But then
> you can get whatever load for individual cfs_rqs from there. Hm, does it
> even make sense to add up run_delays from different CPUs?

The difficulty in aggregating across CPUs is why some and full pressures are
defined the way they are. Ideally, we'd want the full distribution of stall
states across CPUs, but both aggregation and presentation become challenging,
so some/full provide the two extremes. The sum of all cpu_delay adds one more
incomplete signal on top. I don't know how useful it'd be. At Meta, we
depend on PSI a lot when investigating resource problems and we've never
felt the need for the sum time, so that's one data point, with the caveat
that our focus is usually on mem and io pressures, where the some and full
pressure metrics usually seem to provide sufficient information.
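
To make the relationship between some, full and a summed delay concrete, here
is a minimal sketch. It is not kernel code: stall_metrics() and its inputs are
hypothetical, it assumes a fixed set of tasks that are all non-idle for the
whole window, and it collapses everything onto one timeline instead of
modelling per-CPU state. "some" is the share of time at least one task is
stalled, "full" is the share of time all tasks are stalled at once, and the
sum is the total stalled task-time.

```python
def stall_metrics(stalls, window):
    """Return (some, full, total_delay) for one observation window.

    stalls: dict mapping a task name to a list of (start, end) stall
            intervals inside [0, window). Intervals of a single task are
            assumed not to overlap (an assumption of this sketch).
    """
    events = []
    for intervals in stalls.values():
        for start, end in intervals:
            events.append((start, 1))    # task becomes stalled
            events.append((end, -1))     # task stops being stalled
    events.sort()

    ntasks = len(stalls)
    some = full = total = 0.0
    stalled = 0      # number of tasks currently stalled
    prev = 0
    for t, delta in events:
        span = t - prev
        if stalled > 0:
            some += span                 # at least one task stalled
        if ntasks > 0 and stalled == ntasks:
            full += span                 # every task stalled at once
        total += span * stalled          # summed per-task delay
        stalled += delta
        prev = t
    return some / window, full / window, total


# Two hypothetical workloads with the same summed delay but different shapes.
staggered = {"a": [(0, 4)], "b": [(2, 6)]}
aligned = {"a": [(0, 4)], "b": [(0, 4)]}
print(stall_metrics(staggered, 10))
print(stall_metrics(aligned, 10))
```

Both workloads accumulate the same summed delay, yet their some and full
shares differ, so under these assumptions the sum carries information that is
neither a superset nor a subset of what some/full report.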

As the picture provided by some and full metrics is incomplete, I can
imagine adding the sum being useful. That said, it'd help if Abel can
provide more concrete examples of it being useful. Another thing to consider
is whether we should add this across all resources monitored by PSI - cpu,
mem and io.

Thanks.

-- 
tejun
