Message-ID: <dbx8y0u7i9e6.fsf@ynaffit-andsys.c.googlers.com>
Date: Wed, 04 Jun 2025 19:39:29 +0000
From: Tiffany Yang <ynaffit@...gle.com>
To: Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
kernel-team@...roid.com, John Stultz <jstultz@...gle.com>, Thomas
Gleixner <tglx@...utronix.de>, Stephen Boyd <sboyd@...nel.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>, Frederic Weisbecker
<frederic@...nel.org>, Johannes Weiner <hannes@...xchg.org>, Michal
Koutný <mkoutny@...e.com>, "Rafael J. Wysocki"
<rafael@...nel.org>,
Pavel Machek <pavel@...nel.org>, Roman Gushchin
<roman.gushchin@...ux.dev>, Chen Ridong <chenridong@...wei.com>, Ingo
Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, Juri
Lelli <juri.lelli@...hat.com>, Vincent Guittot
<vincent.guittot@...aro.org>, Dietmar Eggemann
<dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, Ben
Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, Valentin
Schneider <vschneid@...hat.com>
Subject: Re: [RFC PATCH] cgroup: Track time in cgroup v2 freezer

Tejun Heo <tj@...nel.org> writes:

> On Tue, Jun 03, 2025 at 10:43:05PM +0000, Tiffany Yang wrote:
>> The cgroup v2 freezer controller allows user processes to be dynamically
>> added to and removed from an interruptible frozen state from
>> userspace. This feature is helpful for application management, as it
>> allows background tasks to be frozen to prevent them from being
>> scheduled or otherwise contending with foreground tasks for resources.
>> Still, applications are usually unaware of their having been placed in
>> the freezer cgroup, so any watchdog timers they may have set will fire
>> when they exit. To address this problem, I propose tracking the per-task
>> frozen time and exposing it to userland via procfs.
>
> Just on a glance, it feels rather odd to be tracking this per task given
> that the state is per cgroup. Can you account this per cgroup?
>
> Thanks.

Hi Tejun!

Thanks for taking a look! In this case, I would argue that the value we
are accounting for (time that a task has not been able to run because it
is in the cgroup v2 frozen state) is task-specific and distinct from the
time that the cgroup it belongs to has been frozen.

A cgroup is not considered frozen until all of its members are frozen,
and if one task then leaves the frozen state, the entire cgroup is
considered no longer frozen, even if its other members stay in the
frozen state. Similarly, even if a task is migrated from one frozen
cgroup (A) to another frozen cgroup (B), the time cgroup B has been
frozen would not be representative of that task even though it is a
member.
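
To illustrate the first point, here is a toy model in plain Python. It is
not kernel code, and the task names and the cgroup_frozen() helper are
invented for the example. It only encodes the rule stated above: the cgroup
counts as frozen while every member is frozen.

```python
def cgroup_frozen(member_states):
    """Return True only if every member task is frozen.

    member_states maps a (hypothetical) task name to its frozen flag.
    """
    return all(member_states.values())


# Three made-up tasks, all frozen: the cgroup is reported as frozen.
members = {"task_a": True, "task_b": True, "task_c": True}
all_frozen = cgroup_frozen(members)

# One task leaves the frozen state: the cgroup is no longer reported as
# frozen, although the other two tasks still cannot run.
members["task_b"] = False
after_one_thaws = cgroup_frozen(members)
still_frozen = [name for name, frozen in members.items() if frozen]
```

In this sketch, task_a and task_c remain frozen after task_b thaws, but a
cgroup-level frozen clock would have stopped for them at that moment.
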

There is also latency between when each task in a cgroup is marked as
to-be-frozen/unfrozen and when it actually enters the frozen state, so
each descendant task has a different frozen time. For watchdogs that
elapse on a per-task basis, a per-cgroup time-in-frozen value would
underreport the actual time each task spent unable to run. Tasks that
miss a deadline might incorrectly be considered misbehaving when the
time they spent suspended was not correctly accounted for.
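
The underreporting can be sketched with made-up numbers. The following is a
toy calculation in plain Python, not kernel code; the task names, the
timestamps (arbitrary time units), and both helper functions are assumptions
for illustration. It treats the cgroup as frozen only during the overlap of
all of its members' frozen intervals.

```python
def task_frozen_time(interval):
    """Time a single task spent frozen, given (enter, leave) timestamps."""
    enter, leave = interval
    return leave - enter


def cgroup_frozen_time(task_intervals):
    """Time during which every task was frozen at once.

    This is the overlap of all per-task frozen intervals, or 0 if they
    never all overlap.
    """
    latest_enter = max(enter for enter, _ in task_intervals.values())
    earliest_leave = min(leave for _, leave in task_intervals.values())
    return max(0, earliest_leave - latest_enter)


# Hypothetical (enter, leave) timestamps; each task reaches the frozen
# state with a different latency.
tasks = {
    "task_a": (100, 900),
    "task_b": (130, 880),
    "task_c": (250, 910),
}

per_task = {name: task_frozen_time(iv) for name, iv in tasks.items()}
per_cgroup = cgroup_frozen_time(tasks)
```

With these numbers, every per-task value is at least as large as the
per-cgroup value, so a watchdog that subtracted only the per-cgroup time
would still charge each task for time it spent frozen.
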

Please let me know if that answers your question or if there's something
I'm missing. I agree that it would be cleaner/preferable to keep this
accounting under a cgroup-specific umbrella, so I hope there is some way
to get around these issues, but it doesn't look like cgroup fs has a
good way to keep task-specific stats at the moment.

--
Tiffany Y. Yang