Message-ID: <ry6p5w3p4l7pnsovyapu6n2by7f4zl63c7umwut2ngdxinx6fs@yu53tunbkxdi>
Date: Mon, 30 Jun 2025 19:40:28 +0200
From: Michal Koutný <mkoutny@...e.com>
To: Tiffany Yang <ynaffit@...gle.com>
Cc: linux-kernel@...r.kernel.org, cgroups@...r.kernel.org, 
	kernel-team@...roid.com, John Stultz <jstultz@...gle.com>, 
	Thomas Gleixner <tglx@...utronix.de>, Stephen Boyd <sboyd@...nel.org>, 
	Anna-Maria Behnsen <anna-maria@...utronix.de>, Frederic Weisbecker <frederic@...nel.org>, 
	Tejun Heo <tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>, 
	"Rafael J. Wysocki" <rafael@...nel.org>, Pavel Machek <pavel@...nel.org>, 
	Roman Gushchin <roman.gushchin@...ux.dev>, Chen Ridong <chenridong@...wei.com>, 
	Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, 
	Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, 
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>
Subject: Re: [RFC PATCH] cgroup: Track time in cgroup v2 freezer

On Fri, Jun 27, 2025 at 12:47:23AM -0700, Tiffany Yang <ynaffit@...gle.com> wrote:
> In our case, the deadline is meant to be relative to the time our task
> spends running; since we don't have a clock for that, we set our timer
> against the system time (CLOCK_MONOTONIC, in this case) as an
> approximation.

Would it be sufficient to measure that deadline against
cpu.stat:usage_usec (CPU time consumed by the cgroup)? Or do I
misunderstand your latter deadline metric?
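For illustration, a userspace agent could track such a CPU-time budget against cpu.stat roughly as below. This is only a sketch: the sample text and numbers are made up, and a real agent would read cpu.stat from the cgroup's directory in the v2 hierarchy.

```python
def parse_usage_usec(cpu_stat_text):
    """Extract usage_usec from the flat keyed cpu.stat format."""
    for line in cpu_stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            return int(value)
    raise KeyError("usage_usec not found")

def cpu_time_remaining(start_usage_usec, current_cpu_stat_text, budget_usec):
    """CPU-time budget left: budget minus CPU time consumed since start."""
    used = parse_usage_usec(current_cpu_stat_text) - start_usage_usec
    return budget_usec - used

# Example with an illustrative snapshot of cpu.stat contents:
sample = "usage_usec 8500000\nuser_usec 6000000\nsystem_usec 2500000"
print(cpu_time_remaining(5_000_000, sample, 10_000_000))  # -> 6500000
```

The point being that time spent frozen simply does not advance usage_usec, so the deadline naturally excludes it.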

> Adding it to /proc/<pid>/stat is an option, but because this metric
> isn't very widely used and exactly what it measures is pretty particular
> ("freezer time, but no, cgroup freezer time, but v2 and not v1"), we
> were hesitant to add it there and make this interface even more
> difficult for folks to parse.

Yeah, it'd need a strong use case to add it there.
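FWIW, the parsing pain is real but manageable: the comm field (in parentheses) may itself contain spaces or ')', so robust parsers split on the *last* ')' rather than naively on whitespace. A minimal sketch (the sample line is illustrative):

```python
def parse_proc_stat(stat_text):
    """Split a /proc/<pid>/stat line robustly. comm (field 2) is in
    parentheses and may contain spaces or ')', so split on the LAST
    ') ' before treating the remainder as space-separated fields."""
    pid_part, _, rest = stat_text.partition(" (")
    comm, _, fields = rest.rpartition(") ")
    return int(pid_part), comm, fields.split(" ")

# Illustrative line; real ones carry many more fields.
pid, comm, fields = parse_proc_stat("1234 (my task) S 1 1234 1234 0 -1 4194560")
print(pid, comm, fields[0])  # -> 1234 my task S
```

New fields are only ever appended at the end, so a parser like this keeps working when one is added.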

> Thank you for asking this! This is a very helpful question. My answer is
> that other causes of delay may be equally important, but this is another
> place where things get messy because of the spectrum of types of
> "delay". If we break delays into 2 categories, delays that were
> requested (sleep) and delays that were not (SIGSTOP), I can say that we
> are primarily interested in delays that were not requested.

(Note that SIGSTOP may be sent to self or within the group.) But mind
that even the "not requested" category splits into two further ones:
resource contention and freezer management. And the latter should be
under the control of the agent that sets the deadlines.

> However, there are many cases that fall somewhere in between, like the
> wakeup latency after a sleep, or that are difficult to account for,
> like blocking on a futex (requested), where the owner might be
> preempted (not requested).

Those are order(s) of magnitude apart. I can't imagine using the
freezer for jobs where wakeup latency also matters.


> Ideally, we could abstract this out in a more general way to other
> delays (like SIGSTOP), but the challenge here is that there isn't a
> clear line that separates a problematic delay from an acceptable
> delay. Suggestions for a framework to approach this more generally are
> very welcome.

Well, there are multiple similar metrics: various (cgroup) PSI, (global)
steal time, cpu.stat:throttled_usage and perhaps some more.
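The cgroup PSI files (cpu.pressure etc.) in particular already give a "time stalled on resources" view in a uniform format. Parsing them is straightforward; a sketch, with a made-up sample:

```python
def parse_psi(pressure_text):
    """Parse PSI 'some'/'full' lines of the form:
       some avg10=0.00 avg60=0.00 avg300=0.00 total=12345
    into {"some": {"avg10": 0.0, ..., "total": 12345}, ...}.
    'total' is cumulative stall time in microseconds."""
    out = {}
    for line in pressure_text.splitlines():
        kind, *pairs = line.split()
        fields = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            fields[key] = int(value) if key == "total" else float(value)
        out[kind] = fields
    return out

sample = ("some avg10=0.12 avg60=0.05 avg300=0.01 total=450000\n"
          "full avg10=0.00 avg60=0.00 avg300=0.00 total=120000")
print(parse_psi(sample)["some"]["total"])  # -> 450000
```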

> In the meantime, focusing on task frozen/stopped time seems like the
> most reasonable approach. Maybe that would be clear enough to make it
> palatable for proc/<pid>/stat ?

Tejun's suggestion of tracking the frozen time of the whole cgroup
could complement the other "debugging" stats provided by cgroups, but
I tend to think it's not a good (and certainly not a complete)
solution to your problem.

Regards,
Michal

