linux-kernel - Re: [PATCH v3 0/2] Exposing nice CPU usage to userspace

Message-ID: <ZvWsWovtTgBi29D9@slm.duckdns.org>
Date: Thu, 26 Sep 2024 08:47:54 -1000
From: Tejun Heo <tj@...nel.org>
To: Michal Koutný <mkoutny@...e.com>
Cc: Joshua Hahn <joshua.hahnjy@...il.com>, cgroups@...r.kernel.org,
	hannes@...xchg.org, linux-kernel@...r.kernel.org,
	linux-kselftest@...r.kernel.org, lizefan.x@...edance.com,
	shuah@...nel.org
Subject: Re: [PATCH v3 0/2] Exposing nice CPU usage to userspace

Hello, Michal.

On Thu, Sep 26, 2024 at 08:10:35PM +0200, Michal Koutný wrote:
...
> On Tue, Sep 10, 2024 at 11:01:07AM GMT, Tejun Heo <tj@...nel.org> wrote:
> > I think it's as useful as system-wide nice metric is.
> 
> Exactly -- and I don't understand how that system-wide value (without
> any cgroups) is useful.
> If I don't know how many there are niced and non-niced tasks and what
> their runnable patterns are, the aggregated nice time can have ambiguous
> interpretations.
> 
> > I think there are benefits to mirroring system wide metrics, at least
> > ones as widely spread as nice.
> 
> I agree with benefits of mirroring of some system wide metrics when they
> are useful <del>but not all of them because it's difficult/impossible to take
> them away once they're exposed</del>. Actually, readers _should_ handle
> missing keys gracefully, so this may be just fine.
> 
> (Is this nice time widely spread? (I remember the field from `top`, still
> not sure how to use it.) Are other proc_stat(5) fields different?

A personal anecdote: I usually run compile jobs with nice and look at the
nice utilization to see what the system is doing. I think it'd be similar
for most folks. Because the number has always been there and is ubiquitous
across many monitoring tools, people end up using it for something. It's not
a great metric, but it's a long-standing and widely available one, so it
ends up being used.
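
To illustrate the system-wide number in question, here is a minimal
sketch of how a monitoring tool might derive nice utilization from two
samples of the aggregate "cpu" line in /proc/stat. The field order follows
proc_stat(5). The sample strings and the helper names (parse_cpu_line,
nice_share) are made up for illustration and are not taken from any
particular tool.

```python
# Field order of the aggregate "cpu" line, per proc_stat(5). The trailing
# guest and guest_nice fields are skipped because they are already included
# in user and nice respectively.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return a dict of tick counters from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))
    raise ValueError("no aggregate cpu line found")


def nice_share(before, after):
    """Fraction of elapsed CPU time spent in niced user tasks."""
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    return delta["nice"] / total if total else 0.0


# Made-up samples; on a real system, read /proc/stat twice with a delay.
SAMPLE_T0 = "cpu  1000 200 300 8000 50 0 10 0 0 0\ncpu0 500 100 150 4000 25 0 5 0 0 0\n"
SAMPLE_T1 = "cpu  1100 600 350 8400 50 0 10 0 0 0\ncpu0 550 300 175 4200 25 0 5 0 0 0\n"

share = nice_share(parse_cpu_line(SAMPLE_T0), parse_cpu_line(SAMPLE_T1))
print(f"nice share: {share:.1%}")
```

With a niced compile job running, a high nice share alongside low user
time suggests the machine is mostly busy with the background build.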

BTW, there are numbers which are actively silly - e.g. iowait, especially
due to how it gets aggregated across multiple CPUs. That one we want to
actively drop, especially as the pressure metrics are a better substitute.
I don't think nice is in that category. It's not the best metric there is,
but it's not useless or misleading.

> I see how this can be the global analog on leaf cgroups but
> interpreting middle cgroups with children of different cpu.weights?)

I think aggregating per-thread numbers is the right thing to do. It's just
the sum of CPU cycles spent by threads which got niced.
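
A toy model of that aggregation, as a sketch only: each cgroup's nice time
is the CPU time of its own niced threads plus that of all its descendants,
regardless of the children's cpu.weight. The Thread and Cgroup classes and
the niced_usec helper are hypothetical and do not reflect the kernel's
data structures. Treating nice as a fixed per-thread property is also a
simplification, since a thread's nice value can change over its lifetime.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    nice: int      # nice value, -20..19; > 0 counts as "niced" here
    cpu_usec: int  # CPU time consumed by this thread


@dataclass
class Cgroup:
    name: str
    threads: list = field(default_factory=list)
    children: list = field(default_factory=list)


def niced_usec(cg):
    """CPU time of niced threads in cg and all of its descendants."""
    own = sum(t.cpu_usec for t in cg.threads if t.nice > 0)
    return own + sum(niced_usec(child) for child in cg.children)


# Made-up hierarchy: a build cgroup with one niced compile thread, and a
# web cgroup whose nested batch child runs a single niced thread.
batch = Cgroup("web/batch", threads=[Thread(nice=19, cpu_usec=300)])
web = Cgroup("web", threads=[Thread(nice=0, cpu_usec=7000)], children=[batch])
build = Cgroup("build", threads=[Thread(nice=10, cpu_usec=5000),
                                 Thread(nice=0, cpu_usec=1000)])
root = Cgroup("root", children=[build, web])

for cg in (root, build, web, batch):
    print(cg.name, niced_usec(cg))
```

A middle cgroup such as "web" reports only what its subtree's niced
threads consumed, which keeps the value a plain sum at every level.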

Thanks.

-- 
tejun