Message-ID: <b161b92e-c290-4a85-adc9-fbc325568e7d@kylinos.cn>
Date: Mon, 9 Jun 2025 11:46:19 +0800
From: zhangzihuan <zhangzihuan@...inos.cn>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: David Hildenbrand <david@...hat.com>, rafael@...nel.org,
len.brown@...el.com, pavel@...nel.org, kees@...nel.org, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com, akpm@...ux-foundation.org,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, vbabka@...e.cz,
rppt@...nel.org, surenb@...gle.com, mhocko@...e.com,
linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH] PM: Optionally block user fork during freeze to
improve performance
Hi Mateusz,
Thanks again for your detailed input.
You’re absolutely right that try_to_freeze_tasks() holds tasklist_lock
while it scans the task list, which temporarily stalls kernel_clone()
during each pass. However, based on our observations and logs, the
problem arises between those passes: during the usleep_range() back-off
the lock is dropped, so fork can complete again in the short window
before the system actually enters suspend.
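For reference, here is a condensed sketch of the retry loop in
try_to_freeze_tasks() (kernel/power/process.c) with that window marked;
wakeup-pending and statistics handling are omitted and the details vary
by kernel version:

	while (true) {
		todo = 0;

		read_lock(&tasklist_lock);
		for_each_process_thread(g, p) {
			if (p == current || !freeze_task(p))
				continue;

			todo++;
		}
		read_unlock(&tasklist_lock);

		if (!todo || time_after(jiffies, end_time))
			break;

		/*
		 * tasklist_lock is not held here: a still-running task
		 * can complete a fork during this back-off, and the new
		 * child is only seen by the next pass above.
		 */
		usleep_range(sleep_usecs / 2, sleep_usecs);
		if (sleep_usecs < 8 * USEC_PER_MSEC)
			sleep_usecs *= 2;
	}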
To illustrate this, I’d like to share some dmesg logs gathered during a
series of S3 suspend attempts. In most of the suspend cycles we
intentionally run a user-space process that forks rapidly during
suspend, and we observe multiple retries during the “freezing user
space processes” phase; selected entries follow the quoted message
below.
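The fork workload is roughly equivalent to the following (a hypothetical
sketch, not the exact script we used; the sleep intervals are arbitrary):

/* fork_spin.c - keep creating short-lived children during suspend. */
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
	/* Auto-reap children so zombies do not accumulate. */
	signal(SIGCHLD, SIG_IGN);

	for (;;) {
		pid_t pid = fork();

		if (pid < 0) {
			/* Out of resources: back off briefly and retry. */
			usleep(10000);
			continue;
		}
		if (pid == 0) {
			/* Child: stay alive long enough to be seen by
			 * the freezer, then exit. */
			sleep(60);
			_exit(0);
		}
		/* Parent: throttle the fork rate slightly. */
		usleep(1000);
	}
}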
On 2025/6/8 23:50, Mateusz Guzik wrote:
> On Sun, Jun 08, 2025 at 03:22:20PM +0800, zhangzihuan wrote:
>> One alternative could be to block in kernel_clone() until freezing ends,
>> instead of returning an error. That way, fork() would not fail, just
>> potentially block briefly (similar to memory pressure or cgroup limits). Do
>> you think that's more acceptable?
> So I had a look at the freezing loop and it operates with
> tasklist_lock held, meaning it already stalls clone().
>
> try_to_freeze_tasks() in kernel/power/process.c contains:
>
> 	todo = 0;
> 	read_lock(&tasklist_lock);
> 	for_each_process_thread(g, p) {
> 		if (p == current || !freeze_task(p))
> 			continue;
>
> 		todo++;
> 	}
> 	read_unlock(&tasklist_lock);
>
> I don't get where the assumption that fork itself is a factor is coming
> from.
>
> Looking at freezing itself, it seems to me the perf trouble starts with
> tons of processes existing to begin with in arbitrary states (not with
> racing against fork), requiring a retry preceded by a sleep:
>
> 	/*
> 	 * We need to retry, but first give the freezing tasks some
> 	 * time to enter the refrigerator. Start with an initial
> 	 * 1 ms sleep followed by exponential backoff until 8 ms.
> 	 */
> 	usleep_range(sleep_usecs / 2, sleep_usecs);
> 	if (sleep_usecs < 8 * USEC_PER_MSEC)
> 		sleep_usecs *= 2;
>
> For a race against fork to have any effect, the new thread has to be
> linked into the global list -- otherwise the todo var won't get bumped.
>
> But then if it gets added in a state which is freezable, the racing fork
> did not cause any trouble.
>
> If it gets added in a state which is *NOT* freezable by the current
> code, maybe it should be patched to be freezable.
>
> All in all I'm not confident any of this warrants any work -- do you
> have a setup where the above causes a real problem?
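Right, and the child only becomes visible to for_each_process_thread()
once copy_process() links it into the task list under tasklist_lock
held for write, roughly (condensed from kernel/fork.c, details vary by
kernel version):

	write_lock_irq(&tasklist_lock);
	/* ... parent/sibling setup ... */
	if (likely(p->pid)) {
		/* ... */
		if (thread_group_leader(p)) {
			/* ... */
			list_add_tail_rcu(&p->tasks, &init_task.tasks);
		}
		/* (non-leader threads are linked into their thread
		 *  group here instead) */
		attach_pid(p, PIDTYPE_PID);
		nr_threads++;
	}
	write_unlock_irq(&tasklist_lock);

So a child created during the back-off sleep is counted by the next
scan, but the previous pass never called freeze_task() on it, which is
where we believe the extra retries come from.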
Here is the log:
dmesg | grep -E 'elap|Files|retry'
[ 2556.566183] Filesystems sync: 0.012 seconds
[ 2556.570653] Freeing user space processes todo:1181 retry:0
[ 2556.572719] Freeing user space processes todo:0 retry:1
[ 2556.572730] Freezing user space processes completed (elapsed 0.006
seconds)
[ 2556.573243] Freeing remaining freezable tasks todo:13 retry:0
[ 2556.574326] Freeing remaining freezable tasks todo:0 retry:1
[ 2556.574333] Freezing remaining freezable tasks completed (elapsed
0.001 seconds)
[ 2560.647576] Filesystems sync: 0.018 seconds
[ 2560.656691] Freeing user space processes todo:2656 retry:0
[ 2560.661194] Freeing user space processes todo:327 retry:1
[ 2560.664130] Freeing user space processes todo:0 retry:2
[ 2560.664139] Freezing user space processes completed (elapsed 0.016
seconds)
[ 2560.665475] Freeing remaining freezable tasks todo:13 retry:0
[ 2560.667159] Freeing remaining freezable tasks todo:0 retry:1
[ 2560.667170] Freezing remaining freezable tasks completed (elapsed
0.003 seconds)
[ 2564.746592] Filesystems sync: 0.013 seconds
[ 2564.761025] Freeing user space processes todo:4192 retry:0
[ 2564.768048] Freeing user space processes todo:252 retry:1
[ 2564.773774] Freeing user space processes todo:0 retry:2
[ 2564.773801] Freezing user space processes completed (elapsed 0.026
seconds)
[ 2564.776704] Freeing remaining freezable tasks todo:13 retry:0
[ 2564.781867] Freeing remaining freezable tasks todo:0 retry:1
[ 2564.781887] Freezing remaining freezable tasks completed (elapsed
0.008 seconds)
[ 2568.872805] Filesystems sync: 0.010 seconds
[ 2568.893397] Freeing user space processes todo:5897 retry:0
[ 2568.903089] Freeing user space processes todo:0 retry:1
[ 2568.903102] Freezing user space processes completed (elapsed 0.030
seconds)
[ 2568.907681] Freeing remaining freezable tasks todo:13 retry:0
[ 2568.914721] Freeing remaining freezable tasks todo:0 retry:1
[ 2568.914743] Freezing remaining freezable tasks completed (elapsed
0.011 seconds)
[ 2573.019240] Filesystems sync: 0.018 seconds
[ 2573.044573] Freeing user space processes todo:7536 retry:0
[ 2573.056378] Freeing user space processes todo:261 retry:1
[ 2573.062016] Freeing user space processes todo:0 retry:2
[ 2573.062024] Freezing user space processes completed (elapsed 0.042
seconds)
[ 2573.067114] Freeing remaining freezable tasks todo:13 retry:0
[ 2573.072597] Freeing remaining freezable tasks todo:0 retry:1
[ 2573.072604] Freezing remaining freezable tasks completed (elapsed
0.010 seconds)
[ 2577.176003] Filesystems sync: 0.013 seconds
[ 2577.210773] Freeing user space processes todo:9042 retry:0
[ 2577.226116] Freeing user space processes todo:637 retry:1
[ 2577.233723] Freeing user space processes todo:0 retry:2
[ 2577.233733] Freezing user space processes completed (elapsed 0.057
seconds)
[ 2577.240897] Freeing remaining freezable tasks todo:13 retry:0
[ 2577.250898] Freeing remaining freezable tasks todo:0 retry:1
[ 2577.250928] Freezing remaining freezable tasks completed (elapsed
0.017 seconds)
[ 2581.358613] Filesystems sync: 0.014 seconds
[ 2581.397288] Freeing user space processes todo:10397 retry:0
[ 2581.415191] Freeing user space processes todo:107 retry:1
[ 2581.423085] Freeing user space processes todo:0 retry:2
[ 2581.423094] Freezing user space processes completed (elapsed 0.064
seconds)
[ 2581.431079] Freeing remaining freezable tasks todo:13 retry:0
[ 2581.441576] Freeing remaining freezable tasks todo:0 retry:1
[ 2581.441596] Freezing remaining freezable tasks completed (elapsed
0.018 seconds)
[ 2585.572128] Filesystems sync: 0.016 seconds
[ 2585.617543] Freeing user space processes todo:12330 retry:0
[ 2585.638997] Freeing user space processes todo:1227 retry:1
[ 2585.648592] Freeing user space processes todo:0 retry:2
[ 2585.648602] Freezing user space processes completed (elapsed 0.076
seconds)
[ 2585.658063] Freeing remaining freezable tasks todo:13 retry:0
[ 2585.670385] Freeing remaining freezable tasks todo:0 retry:1
[ 2585.670405] Freezing remaining freezable tasks completed (elapsed
0.021 seconds)
[ 2589.810371] Filesystems sync: 0.014 seconds
[ 2589.865483] Freeing user space processes todo:14036 retry:0
[ 2589.893513] Freeing user space processes todo:1288 retry:1
[ 2589.904032] Freeing user space processes todo:0 retry:2
[ 2589.904040] Freezing user space processes completed (elapsed 0.093
seconds)
[ 2589.914322] Freeing remaining freezable tasks todo:13 retry:0
[ 2589.925185] Freeing remaining freezable tasks todo:0 retry:1
[ 2589.925191] Freezing remaining freezable tasks completed (elapsed
0.021 seconds)
[ 2594.088171] Filesystems sync: 0.013 seconds
[ 2594.145012] Freeing user space processes todo:15947 retry:0
[ 2594.175153] Freeing user space processes todo:1521 retry:1
[ 2594.187060] Freeing user space processes todo:0 retry:2
[ 2594.187071] Freezing user space processes completed (elapsed 0.098
seconds)
[ 2594.199270] Freeing remaining freezable tasks todo:13 retry:0
[ 2594.215446] Freeing remaining freezable tasks todo:0 retry:1
[ 2594.215468] Freezing remaining freezable tasks completed (elapsed
0.028 seconds)
However, in the last suspend cycle we did not run the fork script, and
the result is quite different:
[ 2678.840809] Filesystems sync: 0.010 seconds
[ 2678.928107] Freeing user space processes todo:16673 retry:0
[ 2678.950744] Freeing user space processes todo:0 retry:1
[ 2678.950759] Freezing user space processes completed (elapsed 0.109
seconds)
[ 2678.971389] Freeing remaining freezable tasks todo:13 retry:0
[ 2678.996021] Freeing remaining freezable tasks todo:0 retry:1
[ 2678.996043] Freezing remaining freezable tasks completed (elapsed
0.045 seconds)
(Note that even this fork-free cycle still includes exactly one retry
pass, i.e. the "todo:16673 retry:0" / "todo:0 retry:1" pair above.)
This pattern is repeatable: when the fork-heavy workload runs during
the freeze/suspend window, we consistently hit two or more retries;
when it does not run, freezing completes quickly and smoothly with a
single retry.
This indicates that user processes created after the freezing scan has
started are interfering with the suspend, which is consistent with a
fork-escape scenario: once tasklist_lock is dropped for the back-off
sleep, a new child can be created that the previous pass never marked
for freezing, so the next pass finds it still running, bumps todo, and
has to sleep and scan again. That is what produces the extra retries
and can occasionally cause the suspend attempt to fail.
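For reference, the "block in kernel_clone() until freezing ends"
alternative I floated earlier could look roughly like the sketch below.
This is only an illustration of the blocking behaviour, not the RFC
patch, and it narrows the window rather than closing it completely;
freezing() and try_to_freeze() are the existing helpers from
<linux/freezer.h>:

	pid_t kernel_clone(struct kernel_clone_args *args)
	{
		/*
		 * If the freezer already wants us frozen, park in the
		 * refrigerator before creating the child, so fork()
		 * blocks across the freeze and resumes after thaw
		 * instead of returning an error.
		 */
		if (freezing(current))
			try_to_freeze();

		/* ... existing kernel_clone() body unchanged ... */
	}

Whether kernel_clone() is an acceptable place to enter the refrigerator
is of course part of what I would like to discuss.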
Hope this helps clarify the issue. Happy to provide further logs or
testing as needed.