lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250619035355.33402-1-zhangzihuan@kylinos.cn>
Date: Thu, 19 Jun 2025 11:53:51 +0800
From: Zihuan Zhang <zhangzihuan@...inos.cn>
To: rafael@...nel.org,
	pavel@...nel.org,
	len.brown@...el.com
Cc: linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Zihuan Zhang <zhangzihuan@...inos.cn>
Subject: [PATCH v2 0/4] PM: freezer: Add retry stats and D-state task logging for debugging

In our internal testing of system suspend and process freezing behavior,
we observed that the time taken to freeze userspace processes is often
non-deterministic. On certain systems (e.g. desktop with Xorg), the freezer
frequently requires 5 to 6 retry iterations before all tasks are frozen.
Upon investigation, we found that this instability is primarily caused by
user processes entering D-state (TASK_UNINTERRUPTIBLE) during the freeze
window. Since such tasks cannot be frozen and are not killable by signals,
they cause the freezer to repeatedly sleep and retry, resulting in:
- Increased system suspend latency (from hundreds of milliseconds up to several milliseconds)
- Elevated CPU usage due to repeated full task list scans
- In worst cases, freezer may fail
This patchset introduces two debugging aids to help analyze and mitigate
this issue:

- Patch 1: Add retry counter and report how many rounds freezing took
- Patch 2: Log all D-state tasks in each freeze iteration
- Patch 3: ensures pm_pr_dbg() works properly
- Patch 4: simplifies state check

These tools make it easier to identify problematic tasks and evaluate
the impact of uninterruptible waits on suspend/resume stability.
Tested on systems where D-state tasks (e.g. from Xorg or poll-heavy
drivers) regularly delay suspend.

With our patch:
dmesg | grep -E 'elap|round'
[   63.227193] freeze round: 0, task to freeze: 681
[   63.228110] freeze round: 1, task to freeze: 1
[   63.230248] freeze round: 2, task to freeze: 1
[   63.234247] freeze round: 3, task to freeze: 1
[   63.242270] freeze round: 4, task to freeze: 1
[   63.250272] freeze round: 5, task to freeze: 1
[   63.258516] freeze round: 6, task to freeze: 1
[   63.266478] freeze round: 7, task to freeze: 0
[   63.266481] Freezing user space processes completed (elapsed 0.039 seconds)
[   63.266641] freeze round: 0, task to freeze: 4
[   63.267347] freeze round: 1, task to freeze: 0
[   63.267349] Freezing remaining freezable tasks completed (elapsed 0.000 seconds)
[   74.452100] freeze round: 0, task to freeze: 686
[   74.452781] freeze round: 1, task to freeze: 1
[   74.454915] freeze round: 2, task to freeze: 1
[   74.458947] freeze round: 3, task to freeze: 1
[   74.466938] freeze round: 4, task to freeze: 1
[   74.474950] freeze round: 5, task to freeze: 1
[   74.482954] freeze round: 6, task to freeze: 1
[   74.491320] freeze round: 7, task to freeze: 0
[   74.491324] Freezing user space processes completed (elapsed 0.039 seconds)
[   74.491485] freeze round: 0, task to freeze: 4
[   74.492748] freeze round: 1, task to freeze: 0
[   74.492750] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[   85.642102] freeze round: 0, task to freeze: 685
[   85.642841] freeze round: 1, task to freeze: 1
[   85.644974] freeze round: 2, task to freeze: 1
[   85.648978] freeze round: 3, task to freeze: 1
[   85.656972] freeze round: 4, task to freeze: 1
[   85.665231] freeze round: 5, task to freeze: 1
[   85.673492] freeze round: 6, task to freeze: 1
[   85.682035] freeze round: 7, task to freeze: 0
[   85.682039] Freezing user space processes completed (elapsed 0.040 seconds)
[   85.682198] freeze round: 0, task to freeze: 4
[   85.682972] freeze round: 1, task to freeze: 0
[   85.682974] Freezing remaining freezable tasks completed (elapsed 0.000 seconds)
[   96.863597] freeze round: 0, task to freeze: 678
[   96.864298] freeze round: 1, task to freeze: 1
[   96.866430] freeze round: 2, task to freeze: 1
[   96.870477] freeze round: 3, task to freeze: 1
[   96.878486] freeze round: 4, task to freeze: 1
[   96.886463] freeze round: 5, task to freeze: 1
[   96.894745] freeze round: 6, task to freeze: 1
[   96.903120] freeze round: 7, task to freeze: 0
[   96.903124] Freezing user space processes completed (elapsed 0.040 seconds)
[   96.903288] freeze round: 0, task to freeze: 4
[   96.904480] freeze round: 1, task to freeze: 0
[   96.904482] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  108.026657] freeze round: 0, task to freeze: 678
[  108.027363] freeze round: 1, task to freeze: 1
[  108.029476] freeze round: 2, task to freeze: 1
[  108.033527] freeze round: 3, task to freeze: 1
[  108.041510] freeze round: 4, task to freeze: 1
[  108.049774] freeze round: 5, task to freeze: 1
[  108.057485] freeze round: 6, task to freeze: 0
[  108.057487] Freezing user space processes completed (elapsed 0.031 seconds)
[  108.057604] freeze round: 0, task to freeze: 4
[  108.058474] freeze round: 1, task to freeze: 0
[  108.058476] Freezing remaining freezable tasks completed (elapsed 0.000 seconds)
[  119.171536] freeze round: 0, task to freeze: 678
[  119.172301] freeze round: 1, task to freeze: 1
[  119.174402] freeze round: 2, task to freeze: 1
[  119.178445] freeze round: 3, task to freeze: 1
[  119.186442] freeze round: 4, task to freeze: 1
[  119.194445] freeze round: 5, task to freeze: 1
[  119.202425] freeze round: 6, task to freeze: 1
[  119.210572] freeze round: 7, task to freeze: 0
[  119.210576] Freezing user space processes completed (elapsed 0.039 seconds)
[  119.210733] freeze round: 0, task to freeze: 4
[  119.211435] freeze round: 1, task to freeze: 0
[  119.211437] Freezing remaining freezable tasks completed (elapsed 0.000 seconds)
[  130.577681] freeze round: 0, task to freeze: 688
[  130.578819] freeze round: 1, task to freeze: 3
[  130.580489] freeze round: 2, task to freeze: 3
[  130.584525] freeze round: 3, task to freeze: 3
[  130.592511] freeze round: 4, task to freeze: 3
[  130.600496] freeze round: 5, task to freeze: 3
[  130.608495] freeze round: 6, task to freeze: 3
[  130.616642] freeze round: 7, task to freeze: 2
[  130.624597] freeze round: 8, task to freeze: 1
[  130.632868] freeze round: 9, task to freeze: 1
[  130.640430] freeze round: 10, task to freeze: 0
[  130.640434] Freezing user space processes completed (elapsed 0.063 seconds)
[  130.640592] freeze round: 0, task to freeze: 4
[  130.641397] freeze round: 1, task to freeze: 0
[  130.641399] Freezing remaining freezable tasks completed (elapsed 0.000 seconds)
[  142.024896] freeze round: 0, task to freeze: 687
[  142.025680] freeze round: 1, task to freeze: 3
[  142.027855] freeze round: 2, task to freeze: 3
[  142.031872] freeze round: 3, task to freeze: 3
[  142.039876] freeze round: 4, task to freeze: 3
[  142.048215] freeze round: 5, task to freeze: 3
[  142.055875] freeze round: 6, task to freeze: 3
[  142.063839] freeze round: 7, task to freeze: 2
[  142.071957] freeze round: 8, task to freeze: 2
[  142.080048] freeze round: 9, task to freeze: 2
[  142.088052] freeze round: 10, task to freeze: 2
[  142.096595] freeze round: 11, task to freeze: 2
[  142.104908] freeze round: 12, task to freeze: 1
[  142.112904] freeze round: 13, task to freeze: 1
[  142.121407] freeze round: 14, task to freeze: 0
[  142.121411] Freezing user space processes completed (elapsed 0.097 seconds)
[  142.121573] freeze round: 0, task to freeze: 4
[  142.122944] freeze round: 1, task to freeze: 0
[  142.122948] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  153.374096] freeze round: 0, task to freeze: 687
[  153.374840] freeze round: 1, task to freeze: 3
[  153.377042] freeze round: 2, task to freeze: 3
[  153.381074] freeze round: 3, task to freeze: 3
[  153.389017] freeze round: 4, task to freeze: 2
[  153.397331] freeze round: 5, task to freeze: 2
[  153.405600] freeze round: 6, task to freeze: 1
[  153.413910] freeze round: 7, task to freeze: 0
[  153.413913] Freezing user space processes completed (elapsed 0.040 seconds)
[  153.414073] freeze round: 0, task to freeze: 4
[  153.414974] freeze round: 1, task to freeze: 0
[  153.414976] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[  164.640178] freeze round: 0, task to freeze: 687
[  164.640865] freeze round: 1, task to freeze: 3
[  164.643050] freeze round: 2, task to freeze: 3
[  164.647084] freeze round: 3, task to freeze: 3
[  164.655014] freeze round: 4, task to freeze: 2
[  164.663304] freeze round: 5, task to freeze: 2
[  164.671018] freeze round: 6, task to freeze: 2
[  164.679274] freeze round: 7, task to freeze: 1
[  164.687812] freeze round: 8, task to freeze: 0
[  164.687815] Freezing user space processes completed (elapsed 0.048 seconds)
[  164.687971] freeze round: 0, task to freeze: 4
[  164.688969] freeze round: 1, task to freeze: 0
[  164.688971] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)


With our patch:
[   62.880497] PM: suspend entry (deep)
[   63.130639] Filesystems sync: 0.249 seconds
[   63.130643] PM: Preparing system for sleep (deep)
[   63.226398] Freezing user space processes
[   63.227193] freeze round: 0, task to freeze: 681
[   63.228110] freeze round: 1, task to freeze: 1
[   63.230064] task:Xorg            state:D stack:0     pid:1404  tgid:1404  ppid:1348   task_flags:0x400100 flags:0x00004004
[   63.230068] Call Trace:
[   63.230069]  <TASK>
[   63.230071]  __schedule+0x52e/0xea0
[   63.230077]  schedule+0x27/0x80
[   63.230079]  schedule_timeout+0xf2/0x100
[   63.230082]  wait_for_completion+0x85/0x130
[   63.230085]  __flush_work+0x21f/0x310
[   63.230087]  ? __pfx_wq_barrier_func+0x10/0x10
[   63.230091]  drm_mode_rmfb+0x138/0x1b0
[   63.230093]  ? __pfx_drm_mode_rmfb_work_fn+0x10/0x10
[   63.230095]  ? __pfx_drm_mode_rmfb_ioctl+0x10/0x10
[   63.230097]  drm_ioctl_kernel+0xa5/0x100
[   63.230099]  drm_ioctl+0x270/0x4b0
[   63.230101]  ? __pfx_drm_mode_rmfb_ioctl+0x10/0x10
[   63.230104]  ? syscall_exit_work+0x108/0x140
[   63.230107]  radeon_drm_ioctl+0x4a/0x80 [radeon]
[   63.230141]  __x64_sys_ioctl+0x93/0xe0
[   63.230144]  ? syscall_trace_enter+0xfa/0x1c0
[   63.230146]  do_syscall_64+0x7d/0x2c0
[   63.230148]  ? do_syscall_64+0x1f3/0x2c0
[   63.230150]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   63.230153] RIP: 0033:0x7f1aa132550b
[   63.230154] RSP: 002b:00007ffebab69678 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   63.230156] RAX: ffffffffffffffda RBX: 00007ffebab696bc RCX: 00007f1aa132550b
[   63.230158] RDX: 00007ffebab696bc RSI: 00000000c00464af RDI: 000000000000000e
[   63.230159] RBP: 00000000c00464af R08: 00007f1aa0c41220 R09: 000055a71ce32310
[   63.230160] R10: 0000000000000087 R11: 0000000000000246 R12: 000055a71b813660
[   63.230161] R13: 000000000000000e R14: 0000000003a8f5cd R15: 000055a71b6bbfb0
[   63.230164]  </TASK>
[   63.230248] freeze round: 2, task to freeze: 1

Changes in v2:
- Add retry count statistics and print D-state process 

Zihuan Zhang (4):
  PM: freezer: Add retry count statistics for freeze pass iterations
  PM: freezer: Print tasks stuck in D-state during freeze
  PM: suspend: Assign pm_suspend_target_state earlier for valid debug
    logs
  PM: suspend: Simplify state check using sleep_state_supported()

 kernel/power/process.c |  7 +++++++
 kernel/power/suspend.c | 17 +++++++----------
 2 files changed, 14 insertions(+), 10 deletions(-)

-- 
2.25.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ