Message-ID: <CAJZ5v0jrU4Xw2wzdUL9Vd2C6u8NVx5J79DeiRY6KU1xT6ZSuqw@mail.gmail.com>
Date: Thu, 27 Jan 2022 20:54:16 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Kelly Rossmoyer <krossmo@...gle.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>,
Pavel Machek <pavel@....cz>, Len Brown <len.brown@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Lee Jones <lee.jones@...aro.org>,
Vijay Nayak <nayakvij@...gle.com>,
Linux PM <linux-pm@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] PM: suspend: Upstreaming wakeup reason capture support
On Mon, Jan 10, 2022 at 7:49 PM Kelly Rossmoyer <krossmo@...gle.com> wrote:
>
> # Introduction
>
> To aid optimization, troubleshooting, and attribution of battery life, the
> Android kernel currently includes a set of patches which provide enhanced
> visibility into kernel suspend/resume/abort behaviors. The capabilities
> and implementation of this feature have evolved significantly since an
> unsuccessful attempt to upstream the original code
> (https://lkml.org/lkml/2014/3/10/716), and we would like to (re)start a
> conversation about upstreaming, starting with the central question: is
> there support for upstreaming this set of features?
>
> # Motivation
>
> Of the many factors influencing battery life on Linux-powered mobile
> devices, kernel suspend tends to be amongst the most impactful. Maximizing
> time spent in suspend and minimizing the frequency of net-negative suspend
> cycles are both important contributors to battery life optimization. But
> enabling that optimization - and troubleshooting when things go wrong -
> requires more observability of suspend/resume/abort behavior than Linux
> currently provides. While mechanisms like `/sys/power/pm_wakeup_irq` and
> wakeup_source stats are useful, they are incomplete and scattered. The
> Android kernel wakeup reason patches implement significant improvements in
> that area.
>
> # Features
>
> As of today, the active set of patches surfaces the following
> suspend-related data:
>
> * wakeup IRQs, including:
>   * multiple IRQs if more than one is pending during resume flow
>   * unmapped HW IRQs (wakeup-capable in HW) that should not be
>     occurring
>   * misconfigured IRQs (e.g. both enable_irq_wake() and
>     IRQF_NO_SUSPEND)
>   * threaded IRQs (not just the parent chip's IRQ)
>
> * non-IRQ wakeups, including:
>   * wakeups caused by an IRQ that was consumed by lower-level SW
>   * wakeups from SOC architecture that don't manifest as IRQs
>
> * abort reasons, including:
>   * wakeup_source activity
>   * failure to freeze userspace
>   * failure to suspend devices
>   * failed syscore_suspend callback
>
> * durations from the most recent cycle, including:
>   * time spent doing suspend/resume work
>   * time spent in suspend
>
> In addition to battery life optimization and troubleshooting, some of these
> capabilities also lay the groundwork for efforts around improving
> attribution of wakeups/aborts (e.g. to specific processes, device features,
> external devices, etc).
>
> # Shortcomings
>
> While the core implementation (see below) is relatively straightforward and
> localized, calls into that core are somewhat widely spread in order to
> capture the breadth of events of interest. The pervasiveness of those
> hooks is clearly an area where improvement would be beneficial, especially
> if a cleaner solution preserved equivalent capabilities.
>
> # Existing Code
>
> As a reference for how Android currently implements the core code for these
> features (which would need a bit of work before submission even if all
> features were included), see the following link:
>
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/kernel/power/wakeup_reason.c

So as Zichar said, this is quite heavy-weight.

I'm not fundamentally against adding more infrastructure to help
identify issues related to system suspend, but there needs to be a
clear benefit associated with any change in this direction. Also
adding significant overhead just for this purpose alone is rather out
of the question.

I would advise you to follow the suggestion to split the work into
smaller pieces and submit them one at a time, possibly starting with
the ones bringing the most significant benefits to the table.