Message-ID: <CAJZ5v0goQ0DcsWAqn__E7CG=YcNAzdxo9bb-21q50V2H5CJ+xA@mail.gmail.com>
Date: Wed, 16 Jul 2025 14:26:39 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Zihuan Zhang <zhangzihuan@...inos.cn>
Cc: "rafael J . wysocki" <rafael@...nel.org>, Peter Zijlstra <peterz@...radead.org>, 
	Oleg Nesterov <oleg@...hat.com>, len brown <len.brown@...el.com>, pavel machek <pavel@...nel.org>, 
	linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce

Hi,

On Wed, Jul 16, 2025 at 8:26 AM Zihuan Zhang <zhangzihuan@...inos.cn> wrote:
>
> Hi all,
>
> This patch series improves the performance of the process freezer by
> skipping zombie tasks during freezing.
>
> In the suspend and hibernation paths, the freezer traverses all tasks
> and attempts to freeze them. However, zombie tasks (EXIT_ZOMBIE with
> PF_EXITING) are already dead — they are not schedulable and cannot enter
> the refrigerator. Attempting to freeze such tasks is redundant and
> unnecessarily increases freezing time.
>
> In particular, on systems under fork storm conditions (e.g., many
> short-lived processes quickly becoming zombies), the number of zombie tasks
> can spike into the thousands or more. We observed that this causes the
> freezer loop to waste significant time processing tasks that are guaranteed
> to not need freezing.

I think that the discussion with Peter regarding this has not been concluded.

I thought that there was an alternative patch proposed during that
discussion.  If I'm not mistaken about this, what happened to that
patch?

Thanks!

> Testing and Results
>
> Platform:
> - Architecture: x86_64
> - CPU: ZHAOXIN KaiXian KX-7000
> - RAM: 16 GB
> - Kernel: v6.6.93
>
> Results without the patch:
> dmesg | grep elap
> [  219.608992] Freezing user space processes completed (elapsed 0.010 seconds)
> [  219.617355] Freezing remaining freezable tasks completed (elapsed 0.008 seconds)
> [  228.029119] Freezing user space processes completed (elapsed 0.013 seconds)
> [  228.040672] Freezing remaining freezable tasks completed (elapsed 0.011 seconds)
> [  236.879065] Freezing user space processes completed (elapsed 0.020 seconds)
> [  236.897976] Freezing remaining freezable tasks completed (elapsed 0.018 seconds)
> [  246.276679] Freezing user space processes completed (elapsed 0.026 seconds)
> [  246.298636] Freezing remaining freezable tasks completed (elapsed 0.021 seconds)
> [  256.221504] Freezing user space processes completed (elapsed 0.030 seconds)
> [  256.248955] Freezing remaining freezable tasks completed (elapsed 0.027 seconds)
> [  266.674987] Freezing user space processes completed (elapsed 0.040 seconds)
> [  266.709811] Freezing remaining freezable tasks completed (elapsed 0.034 seconds)
> [  277.701679] Freezing user space processes completed (elapsed 0.046 seconds)
> [  277.742048] Freezing remaining freezable tasks completed (elapsed 0.040 seconds)
> [  289.246611] Freezing user space processes completed (elapsed 0.046 seconds)
> [  289.290753] Freezing remaining freezable tasks completed (elapsed 0.044 seconds)
> [  301.516854] Freezing user space processes completed (elapsed 0.041 seconds)
> [  301.576287] Freezing remaining freezable tasks completed (elapsed 0.059 seconds)
> [  314.422499] Freezing user space processes completed (elapsed 0.043 seconds)
> [  314.465804] Freezing remaining freezable tasks completed (elapsed 0.043 seconds)
>
> Results with the patch:
> dmesg | grep elap
> [   54.161674] Freezing user space processes completed (elapsed 0.007 seconds)
> [   54.171536] Freezing remaining freezable tasks completed (elapsed 0.009 seconds)
> [   62.556462] Freezing user space processes completed (elapsed 0.006 seconds)
> [   62.566496] Freezing remaining freezable tasks completed (elapsed 0.010 seconds)
> [   71.395421] Freezing user space processes completed (elapsed 0.009 seconds)
> [   71.402820] Freezing remaining freezable tasks completed (elapsed 0.007 seconds)
> [   80.785463] Freezing user space processes completed (elapsed 0.010 seconds)
> [   80.793914] Freezing remaining freezable tasks completed (elapsed 0.008 seconds)
> [   90.962659] Freezing user space processes completed (elapsed 0.012 seconds)
> [   90.973519] Freezing remaining freezable tasks completed (elapsed 0.010 seconds)
> [  101.435638] Freezing user space processes completed (elapsed 0.013 seconds)
> [  101.449023] Freezing remaining freezable tasks completed (elapsed 0.013 seconds)
> [  112.669786] Freezing user space processes completed (elapsed 0.015 seconds)
> [  112.683540] Freezing remaining freezable tasks completed (elapsed 0.013 seconds)
> [  124.585681] Freezing user space processes completed (elapsed 0.017 seconds)
> [  124.599553] Freezing remaining freezable tasks completed (elapsed 0.013 seconds)
> [  136.826635] Freezing user space processes completed (elapsed 0.016 seconds)
> [  136.841840] Freezing remaining freezable tasks completed (elapsed 0.015 seconds)
> [  149.686575] Freezing user space processes completed (elapsed 0.016 seconds)
> [  149.701549] Freezing remaining freezable tasks completed (elapsed 0.014 seconds)
>
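
To put a single number on each of the two runs above, here is a minimal sketch that averages the "elapsed" values of the user-space freeze lines. It assumes the message format stays exactly as in the quoted dmesg output. The function name avg_user_freeze is made up for illustration and is not part of the patch or of any existing tool.

```bash
# Hypothetical helper (name is an assumption): read dmesg output on stdin
# and print the mean "elapsed" time of the user-space freeze messages.
avg_user_freeze() {
    awk '/Freezing user space processes completed/ {
             # The elapsed value is the field right after "(elapsed".
             for (i = 1; i < NF; i++)
                 if ($i == "(elapsed") { sum += $(i + 1); n++ }
         }
         END { if (n) printf "%d samples, mean %.4f s\n", n, sum / n }'
}

# Usage: dmesg | avg_user_freeze
```

Fed the ten rounds quoted above, the arithmetic comes out to a mean of 0.0315 s without the patch and 0.0121 s with it. Each round uses a different zombie count, so the mean is only a rough summary and not a replacement for the per-round comparison.
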
> Here is the user-space fork storm simulator used for testing:
>
> ```c
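> // create_zombie.c
> #include <stdio.h>      // fprintf, printf, perror
> #include <stdlib.h>     // exit, strtol, EXIT_FAILURE
> #include <sys/types.h>  // pid_t
> #include <unistd.h>     // fork, sleep, _exit
>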
> void usage(const char *prog) {
>     fprintf(stderr, "Usage: %s <number_of_zombies>\n", prog);
>     exit(EXIT_FAILURE);
> }
>
> int main(int argc, char *argv[]) {
>     if (argc != 2) {
>         usage(argv[0]);
>     }
>
>     long num_zombies = strtol(argv[1], NULL, 10);
>     if (num_zombies <= 0 || num_zombies > 1000000) {
>         fprintf(stderr, "Invalid number of zombies: %ld\n", num_zombies);
>         exit(EXIT_FAILURE);
>     }
>
>     printf("Creating %ld zombie processes...\n", num_zombies);
>
>     for (long i = 0; i < num_zombies; i++) {
>         pid_t pid = fork();
>         if (pid < 0) {
>             perror("fork failed");
>             exit(EXIT_FAILURE);
>         } else if (pid == 0) {
>             // Child exits immediately; _exit() skips stdio flushing so
>             // output buffered before fork() is not written twice
>             _exit(0);
>         }
>         // Parent does NOT wait, leaving zombie
>     }
>
>     printf("All child processes created. Sleeping for 60 seconds...\n");
>     sleep(60);
>
>     printf("Parent exiting, zombies will be reaped by init.\n");
>     return 0;
> }
> ```
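
Since the simulator only prints that the children were created, it may help to confirm how many zombies actually exist right before suspending. A minimal sketch, assuming a Linux /proc where the task state is the first field after the closing parenthesis of the command name in /proc/<pid>/stat. The helper name count_zombies is hypothetical.

```bash
# Hypothetical helper (name is an assumption): count tasks in state Z
# by reading /proc/<pid>/stat with shell builtins only.
count_zombies() {
    n=0
    for d in /proc/[0-9]*; do
        # A task may vanish between the glob and the read; skip it then.
        { IFS= read -r line < "$d/stat"; } 2>/dev/null || continue
        # The command name can contain spaces or parentheses, so strip
        # everything up to the last ") " to reach the state field.
        rest=${line##*") "}
        case $rest in
            Z*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Usage: count_zombies
```

Calling count_zombies after the first `sleep 5` in the test loop should report a number close to the one passed to create_zombie, assuming the fork loop has finished and no unrelated zombies are around.
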
>
> And we used a shell loop to suspend repeatedly:
>
> ```bash
> LOOPS=10
>
> echo none > /sys/power/pm_test
> echo 5 > /sys/module/suspend/parameters/pm_test_delay
> for ((i=1; i<=LOOPS; i++)); do
>     echo "===== Test round $i/$LOOPS ====="
>     ./create_zombie $((i * 3000)) &
>     sleep 5
>     echo mem > /sys/power/state
>
>     pkill create_zombie
>     echo "Round $i complete. Waiting 5s..."
>     sleep 5
> done
> echo "==== All $LOOPS rounds complete ===="
> ```
>
> Zihuan Zhang (1):
>   PM / Freezer: Skip zombie/dead processes to reduce freeze latency
>
>  kernel/power/process.c | 2 +-
>  1 file changed, 9 insertion(+), 1 deletion(-)
>
> --
> 2.25.1
>
