Message-ID: <20190319225144.GA80186@google.com>
Date: Wed, 20 Mar 2019 07:51:44 +0900
From: Minchan Kim <minchan@...nel.org>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: gregkh@...uxfoundation.org, tj@...nel.org, lizefan@...wei.com,
hannes@...xchg.org, axboe@...nel.dk, dennis@...nel.org,
dennisszhou@...il.com, mingo@...hat.com, peterz@...radead.org,
akpm@...ux-foundation.org, corbet@....net, cgroups@...r.kernel.org,
linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v5 0/7] psi: pressure stall monitors v5
On Fri, Mar 08, 2019 at 10:43:04AM -0800, Suren Baghdasaryan wrote:
> This is a respin of:
> https://lwn.net/ml/linux-kernel/20190206023446.177362-1-surenb%40google.com/
>
> Android is adopting psi to detect and remedy memory pressure that
> results in stuttering and decreased responsiveness on mobile devices.
>
> Psi gives us the stall information, but because we're dealing with
> latencies in the millisecond range, periodically reading the pressure
> files to detect stalls in a timely fashion is not feasible. Psi also
> doesn't aggregate its averages at a high-enough frequency right now.
>
> This patch series extends the psi interface such that users can
> configure sensitive latency thresholds and use poll() and friends to
> be notified when these are breached.
>
> As high-frequency aggregation is costly, it implements an aggregation
> method that is optimized for fast, short-interval averaging, and makes
> the aggregation frequency adaptive, such that high-frequency updates
> only happen while monitored stall events are actively occurring.
>
> With these patches applied, Android can monitor for, and ward off,
> mounting memory shortages before they cause problems for the user.
> For example, using memory stall monitors in the userspace low memory
> killer daemon (lmkd), we can detect mounting pressure and kill less
> important processes before the device becomes visibly sluggish. In our
> memory stress testing, psi memory monitors produce roughly 10x fewer
> false positives than vmpressure signals. The ability to specify
> multiple triggers for the same psi metric allows other parts of the
> Android framework to monitor the memory state of the device and act
> accordingly.
>
> The new interface is straightforward. The user opens one of the
> pressure files for writing and writes a trigger description into the
> file descriptor. The description defines the stall state (some or
> full) and the maximum stall time over a given window of time. E.g.:
>
> /* Signal when stall time exceeds 100ms of a 1s window */
> char trigger[] = "full 100000 1000000";
> struct pollfd fds;
>
> fds.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
> fds.events = POLLPRI;
> write(fds.fd, trigger, sizeof(trigger));
> while (poll(&fds, 1, -1) >= 0) {
>         ...
> }
> close(fds.fd);
>
> When the monitored stall state is entered, psi adapts its aggregation
> frequency according to what the configured time window requires in
> order to emit event signals in a timely fashion. Once the stalling
> subsides, aggregation reverts to normal.
>
> The trigger is associated with the open file descriptor. To stop
> monitoring, the user only needs to close the file descriptor and the
> trigger is discarded.
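For readers who want something closer to compilable code, the trigger lifecycle described above might be wrapped roughly as follows. This is only a sketch under stated assumptions, not code from the series: the helper names (psi_format_trigger, psi_register_trigger, psi_wait_event) are hypothetical, the microsecond unit is inferred from the "100ms of a 1s window" comment, the stall-versus-window sanity check is my own addition, and the use of O_RDWR | O_NONBLOCK and POLLPRI follows the psi documentation in mainline kernels rather than anything stated in this cover letter.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Build a trigger description such as "full 100000 1000000".
 * Assumption: both numbers are microseconds, as implied by the
 * "100ms of a 1s window" comment in the cover letter.
 * Returns 0 on success, -1 on bad arguments or a too-small buffer.
 */
int psi_format_trigger(char *buf, size_t len, const char *state,
                       unsigned long stall_ms, unsigned long window_ms)
{
    assert(buf != NULL && state != NULL);
    if (strcmp(state, "some") != 0 && strcmp(state, "full") != 0)
        return -1;
    if (stall_ms > window_ms)   /* hypothetical sanity check */
        return -1;
    int n = snprintf(buf, len, "%s %lu %lu", state,
                     stall_ms * 1000UL, window_ms * 1000UL);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Open a pressure file and write the trigger into it.
 * Returns the descriptor, or -1 on failure. Closing the descriptor
 * discards the trigger.
 */
int psi_register_trigger(const char *path, const char *trigger)
{
    int fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -1;
    /* Include the terminating NUL, like sizeof(trigger) in the example. */
    if (write(fd, trigger, strlen(trigger) + 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait for the trigger: 1 on event, 0 on timeout, -1 on error. */
int psi_wait_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;
    if (pfd.revents & POLLERR)
        return -1;
    return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A caller would format the string, register it on /proc/pressure/memory, loop on psi_wait_event(), and close() the descriptor to stop monitoring. Keeping the formatting separate from the file I/O makes the string easy to check on a machine without psi support.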
>
> Patches 1-6 prepare the psi code for polling support. Patch 7 implements
> the adaptive polling logic, the pressure growth detection optimized for
> short intervals, and hooks up write() and poll() on the pressure files.
>
> The patches were developed in collaboration with Johannes Weiner.
>
> The patches are based on 5.0-rc8 (Merge tag 'drm-next-2019-03-06').
>
> Suren Baghdasaryan (7):
> psi: introduce state_mask to represent stalled psi states
> psi: make psi_enable static
> psi: rename psi fields in preparation for psi trigger addition
> psi: split update_stats into parts
> psi: track changed states
> refactor header includes to allow kthread.h inclusion in psi_types.h
> psi: introduce psi monitor
>
> Documentation/accounting/psi.txt | 107 ++++++
> include/linux/kthread.h | 3 +-
> include/linux/psi.h | 8 +
> include/linux/psi_types.h | 105 +++++-
> include/linux/sched.h | 1 -
> kernel/cgroup/cgroup.c | 71 +++-
> kernel/kthread.c | 1 +
> kernel/sched/psi.c | 613 ++++++++++++++++++++++++++++---
> 8 files changed, 833 insertions(+), 76 deletions(-)
>
> Changes in v5:
> - Fixed sparse: error: incompatible types in comparison expression, as per
> Andrew
> - Changed psi_enable to static, as per Andrew
> - Refactored headers to be able to include kthread.h into psi_types.h
> without creating a circular inclusion, as per Johannes
> - Split psi monitor from aggregator, used RT worker for psi monitoring to
> prevent it being starved by other RT threads and memory pressure events
> being delayed or lost, as per Minchan and Android Performance Team
> - Fixed blockable memory allocation under rcu_read_lock inside
> psi_trigger_poll by using refcounting, as per Eva Huang and Minchan
> - Misc cleanup and improvements, as per Johannes
>
> Notes:
> 0001-psi-introduce-state_mask-to-represent-stalled-psi-st.patch is unchanged
> from the previous version and provided for completeness.
Please fix the kbuild test bot's warning in 6/7.
Other than that, for all patches,
Acked-by: Minchan Kim <minchan@...nel.org>