Message-ID: <YY6cAFtHOhw2zEc7@cmpxchg.org>
Date: Fri, 12 Nov 2021 11:53:20 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Brian Chen <brianchen118@...il.com>
Cc: brianc118@...com, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] psi: fix PSI_MEM_FULL state when tasks are in memstall
and doing reclaim

On Wed, Nov 10, 2021 at 09:33:12PM +0000, Brian Chen wrote:
> We've noticed cases where tasks in a cgroup are stalled on memory but
> there is little memory FULL pressure since tasks stay on the runqueue
> in reclaim.
>
> A simple example involves a single threaded program that keeps leaking
> and touching large amounts of memory. It runs in a cgroup with swap
> enabled, memory.high set at 10M and cpu.max ratio set at 5%. Though
> there is significant CPU pressure and memory SOME, there is barely any
> memory FULL since the task enters reclaim and stays on the runqueue.
> However, this memory-bound task is effectively stalled on memory and
> we expect memory FULL to match memory SOME in this scenario.
>
> The code is confused about memstall && running, thinking there is a
> stalled task and a productive task when there's only one task: a
> reclaimer that's counted as both. To fix this, we redefine the
> condition for PSI_MEM_FULL to check that all running tasks are in an
> active memstall instead of checking that there are no running tasks.
>
> case PSI_MEM_FULL:
> - return unlikely(tasks[NR_MEMSTALL] && !tasks[NR_RUNNING]);
> + return unlikely(tasks[NR_MEMSTALL] &&
> + tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
>
> This will capture reclaimers. It will also capture tasks that called
> psi_memstall_enter() and are about to sleep, but this should be
> negligible noise.
>
> Signed-off-by: Brian Chen <brianchen118@...il.com>

Acked-by: Johannes Weiner <hannes@...xchg.org>

This bug essentially causes us to count memory-some in walltime and
memory-full in tasktime, which can be quite confusing and misleading
in combined CPU and memory pressure situations.
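
To make that concrete, here is a minimal userspace sketch of the
single-reclaimer scenario from the changelog. This is illustration
only, not kernel code; the counter indices and helper names are
hypothetical stand-ins for the real state test:

#include <stdio.h>
#include <stdbool.h>

/* per-CPU task counts, mirroring the NR_* counters */
enum { NR_RUNNING, NR_MEMSTALL, NR_MEMSTALL_RUNNING, NR_COUNTS };

static bool mem_full_old(const unsigned int *t)
{
	/* old: FULL only when nothing is runnable at all */
	return t[NR_MEMSTALL] && !t[NR_RUNNING];
}

static bool mem_full_new(const unsigned int *t)
{
	/* new: FULL when every runnable task is in memstall */
	return t[NR_MEMSTALL] && t[NR_RUNNING] == t[NR_MEMSTALL_RUNNING];
}

int main(void)
{
	/* one task, currently reclaiming: it is running, in memstall,
	 * and in memstall while running */
	unsigned int t[NR_COUNTS] = { 1, 1, 1 };

	printf("old FULL=%d, new FULL=%d\n",
	       mem_full_old(t), mem_full_new(t)); /* old 0, new 1 */
	return 0;
}

With the old condition the reclaimer's own runnability suppresses FULL
time, so SOME accrues in walltime while FULL barely accrues at all;
with the new condition FULL matches SOME, as expected for a fully
memory-bound group.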
The fix looks good to me, thanks Brian.

The bug's been there since the initial psi commit, so I don't think a
stable backport is warranted.

Peter, absent objections, can you please pick this up through -tip?