Message-Id: <20241114220921.2529905-1-saravanak@google.com>
Date: Thu, 14 Nov 2024 14:09:14 -0800
From: Saravana Kannan <saravanak@...gle.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>, Pavel Machek <pavel@....cz>, Len Brown <len.brown@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>
Cc: Saravana Kannan <saravanak@...gle.com>, Geert Uytterhoeven <geert@...ux-m68k.org>,
Marek Vasut <marex@...x.de>, "Bird, Tim" <Tim.Bird@...y.com>,
kernel-team@...roid.com, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: [PATCH v1 0/5] Optimize async device suspend/resume
A lot of the details are in patches 4/5 and 5/5. The summary is that
there's a lot of overhead and wasted work in how async device
suspend/resume is handled today. I talked about this and other
suspend/resume issues at LPC 2024[1].
Much of that overhead can be removed by doing a breadth-first queuing
of async suspends/resumes, which is what this patch series does. I also
noticed that during resume, because of Energy Aware Scheduling (EAS),
the bigger CPUs aren't used as quickly. That was causing a lot of
scheduling latency and preemption of runnable threads, which increased
the resume latency. So we also disable EAS for the short window of
resume where we know there'll be a lot of parallelism.
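To give a rough idea of the queuing scheme, here's a simplified sketch.
This is not the actual patch: dev_resume_one() and queue_root_resumes()
are made-up names, dpm_list locking and error handling are elided, and
the real series also has to handle supplier/consumer device links while
this only walks parent->child.

static void async_resume_dev(void *data, async_cookie_t cookie);

/* device_for_each_child() callback: queue one child's async resume. */
static int queue_child_resume(struct device *child, void *unused)
{
        async_schedule_dev(async_resume_dev, child);
        return 0;
}

/*
 * Resume one device, then fan out to its children. Queuing a child
 * only after its parent has finished is what makes the traversal
 * breadth-first: every queued work item is immediately runnable, so
 * async kworkers never block waiting on an unfinished parent.
 */
static void async_resume_dev(void *data, async_cookie_t cookie)
{
        struct device *dev = data;

        dev_resume_one(dev);    /* hypothetical per-device resume */
        device_for_each_child(dev, NULL, queue_child_resume);
}

/* Seed the traversal with the root devices (those without a parent). */
static void queue_root_resumes(void)
{
        struct device *dev;

        list_for_each_entry(dev, &dpm_list, power.entry)
                if (!dev->parent)
                        async_schedule_dev(async_resume_dev, dev);
}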
On a Pixel 6, averaging over 100 suspend/resume cycles, this patch
series yields significant improvements:
+---------------------------+---------------+----------------+----------------+
| Phase                     | Old full sync | Old full async | New full async |
|                           |               |                | + EAS disabled |
+---------------------------+---------------+----------------+----------------+
| Total dpm_suspend*() time |        107 ms |          72 ms |          62 ms |
+---------------------------+---------------+----------------+----------------+
| Total dpm_resume*() time  |         75 ms |          90 ms |          61 ms |
+---------------------------+---------------+----------------+----------------+
| Sum                       |        182 ms |         162 ms |         123 ms |
+---------------------------+---------------+----------------+----------------+
There might be room for some more optimizations in the future, but I'm
keeping this patch series simple so that it's easier to review and
verify that it doesn't break anything. If this series lands and stays
stable with no bug reports for a few months, I can work on optimizing
it a bit further.
Thanks,
Saravana
P.S: Cc'ing some of the usual suspects who might be interested in
testing this out.
[1] - https://lpc.events/event/18/contributions/1845/
Saravana Kannan (5):
PM: sleep: Fix runtime PM issue in dpm_resume()
PM: sleep: Remove unnecessary mutex lock when waiting on parent
PM: sleep: Add helper functions to loop through superior/subordinate
devs
PM: sleep: Do breadth first suspend/resume for async suspend/resume
PM: sleep: Spread out async kworker threads during dpm_resume*()
phases
drivers/base/power/main.c | 325 +++++++++++++++++++++++++++++---------
kernel/power/suspend.c | 16 ++
kernel/sched/topology.c | 13 ++
3 files changed, 276 insertions(+), 78 deletions(-)
--
2.47.0.338.g60cca15819-goog