Message-ID: <aJSpTpB9_jijiO6m@tiehlicka>
Date: Thu, 7 Aug 2025 15:25:34 +0200
From: Michal Hocko <mhocko@...e.com>
To: Zihuan Zhang <zhangzihuan@...inos.cn>
Cc: "Rafael J . Wysocki" <rafael@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Oleg Nesterov <oleg@...hat.com>,
David Hildenbrand <david@...hat.com>,
Jonathan Corbet <corbet@....net>, Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
len brown <len.brown@...el.com>, pavel machek <pavel@...nel.org>,
Kees Cook <kees@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Catalin Marinas <catalin.marinas@....com>,
Nico Pache <npache@...hat.com>, xu xin <xu.xin16@....com.cn>,
wangfushuai <wangfushuai@...du.com>,
Andrii Nakryiko <andrii@...nel.org>,
Christian Brauner <brauner@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Jeff Layton <jlayton@...nel.org>, Al Viro <viro@...iv.linux.org.uk>,
Adrian Ratiu <adrian.ratiu@...labora.com>, linux-pm@...r.kernel.org,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v1 0/9] freezer: Introduce freeze priority model to
address process dependency issues
On Thu 07-08-25 20:14:09, Zihuan Zhang wrote:
> The Linux task freezer was designed in a much earlier era, when userspace was relatively simple and flat.
> Over the years, as modern desktop and mobile systems have become increasingly complex—with intricate IPC,
> asynchronous I/O, and deep event loops—the original freezer model has shown its age.
A modern userspace might be more complex or convoluted but I do not
think the above statement is accurate or even correct.
> ## Background
>
> Currently, the freezer traverses the task list linearly and attempts to freeze all tasks equally.
> It sends a signal and waits for `freezing()` to become true. While this model works well in many cases, it has several inherent limitations:
>
> - Signal-based logic cannot freeze uninterruptible (D-state) tasks
> - Dependencies between processes can cause freeze retries
> - Retry-based recovery introduces unpredictable suspend latency
>
> ## Real-world problem illustration
>
> Consider the following scenario during suspend:
>
> Freeze Window Begins
>
> [process A] - epoll_wait()
> │
> ▼
> [process B] - event source (already frozen)
>
> → A enters D-state because of waiting for B
I thought epoll_wait was waiting in interruptible sleep.
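One way to check this from userspace is sketched below. It only shows that a signal can end a blocked epoll_wait() with EINTR, which is what an interruptible sleep implies; it does not exercise the freezer itself. The function name, the 100ms timer, and the 5s timeout are arbitrary choices for the illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t got_alarm;

static void on_alarm(int sig)
{
	(void)sig;
	got_alarm = 1;
}

/*
 * Returns 1 if a signal interrupted epoll_wait() (errno == EINTR),
 * 0 if the wait was not interrupted, -1 on setup failure.
 */
int epoll_wait_is_interruptible(void)
{
	struct sigaction sa;
	struct itimerval timer;
	struct epoll_event ev;
	int epfd, ret, saved_errno;

	/* Install a SIGALRM handler; epoll_wait() is not restarted after it. */
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) < 0)
		return -1;

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	/* Deliver SIGALRM once, 100ms from now. */
	memset(&timer, 0, sizeof(timer));
	timer.it_value.tv_usec = 100 * 1000;
	if (setitimer(ITIMER_REAL, &timer, NULL) < 0) {
		close(epfd);
		return -1;
	}

	got_alarm = 0;
	/* No fds are registered: only the 5s timeout or a signal ends the wait. */
	ret = epoll_wait(epfd, &ev, 1, 5000);
	saved_errno = errno;
	close(epfd);

	return ret == -1 && saved_errno == EINTR && got_alarm;
}
```

A task stuck in uninterruptible (D-state) sleep would not return early like this, so a result of 1 is consistent with the waiter being in interruptible sleep.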
> → Cannot respond to freezing signal
> → Freezer retries in a loop
> → Suspend latency spikes
>
> In such cases, we observed that a normal 1–2ms freezer cycle could balloon to **tens of milliseconds**.
> Worse, the kernel has no insight into the root cause and simply retries blindly.
>
> ## Proposed solution: Freeze priority model
>
> To address this, we propose a **layered freeze model** based on per-task freeze priorities.
>
> ### Design
>
> We introduce 4 levels of freeze priority:
>
>
> | Priority | Level        | Description                    |
> |----------|--------------|--------------------------------|
> | 0        | HIGH         | D-state tasks                  |
> | 1        | NORMAL       | regular user space tasks       |
> | 2        | LOW          | not yet used                   |
> | 4        | NEVER_FREEZE | zombie tasks, PF_SUSPEND_TASK  |
>
>
> The kernel will freeze processes **in priority order**, ensuring that higher-priority tasks are frozen first.
> This avoids dependency inversion scenarios and provides a deterministic path forward for tricky cases.
> By freezing control or event-source threads first, we prevent dependent tasks from entering D-state prematurely — effectively avoiding dependency inversion.
I really fail to see how that is supposed to work to be honest. If a
process is running in the userspace then the priority shouldn't really
matter much. Tasks will get a signal, freeze themselves and you are
done. If they are running in the userspace and e.g. sleeping while not
TASK_FREEZABLE then priority simply makes no difference. And if they are
TASK_FREEZABLE then the priority doesn't matter either.
What am I missing?
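
The reasoning above can be restated as a toy model. This is a sketch of the argument only, not kernel code: the toy_* names and the three states are hypothetical, and the model assumes that whether a task freezes depends solely on that task's own state. It deliberately contains no cross-task dependency, which is exactly what the proposal would need to demonstrate.

```c
#include <stddef.h>

/* Hypothetical task states for this toy model; not kernel definitions. */
enum toy_state {
	TOY_IN_USERSPACE,        /* gets the signal and freezes itself */
	TOY_SLEEP_FREEZABLE,     /* sleeping with TASK_FREEZABLE */
	TOY_SLEEP_NOT_FREEZABLE, /* sleeping without TASK_FREEZABLE */
};

struct toy_task {
	enum toy_state state;
	int priority; /* would only influence the visiting order */
	int frozen;
};

/* Model assumption: the outcome depends only on the task's own state. */
static int toy_try_freeze(struct toy_task *t)
{
	t->frozen = (t->state != TOY_SLEEP_NOT_FREEZABLE);
	return t->frozen;
}

/*
 * One freezer pass that visits tasks in the given index order.
 * Returns the number of tasks left unfrozen after the pass.
 */
int toy_freeze_pass(struct toy_task *tasks, const size_t *order, size_t n)
{
	size_t i;
	int unfrozen = 0;

	for (i = 0; i < n; i++)
		if (!toy_try_freeze(&tasks[order[i]]))
			unfrozen++;
	return unfrozen;
}
```

Under that assumption, any permutation passed as order leaves the same tasks unfrozen, so a priority-based ordering changes nothing in this model.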
--
Michal Hocko
SUSE Labs