Message-ID: <YY6cAFtHOhw2zEc7@cmpxchg.org>
Date:   Fri, 12 Nov 2021 11:53:20 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Brian Chen <brianchen118@...il.com>
Cc:     brianc118@...com, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] psi: fix PSI_MEM_FULL state when tasks are in memstall
 and doing reclaim

On Wed, Nov 10, 2021 at 09:33:12PM +0000, Brian Chen wrote:
> We've noticed cases where tasks in a cgroup are stalled on memory but
> there is little memory FULL pressure since tasks stay on the runqueue
> in reclaim.
> 
> A simple example involves a single threaded program that keeps leaking
> and touching large amounts of memory. It runs in a cgroup with swap
> enabled, memory.high set at 10M and cpu.max ratio set at 5%. Though
> there is significant CPU pressure and memory SOME, there is barely any
> memory FULL since the task enters reclaim and stays on the runqueue.
> However, this memory-bound task is effectively stalled on memory and
> we expect memory FULL to match memory SOME in this scenario.
> 
> The code is confused about memstall && running, thinking there is a
> stalled task and a productive task when there's only one task: a
> reclaimer that's counted as both. To fix this, we redefine the
> condition for PSI_MEM_FULL to check that all running tasks are in an
> active memstall instead of checking that there are no running tasks.
> 
>         case PSI_MEM_FULL:
> -               return unlikely(tasks[NR_MEMSTALL] && !tasks[NR_RUNNING]);
> +               return unlikely(tasks[NR_MEMSTALL] &&
> +                       tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING]);
> 
> This will capture reclaimers. It will also capture tasks that called
> psi_memstall_enter() and are about to sleep, but this should be
> negligible noise.
> 
> Signed-off-by: Brian Chen <brianchen118@...il.com>

Acked-by: Johannes Weiner <hannes@...xchg.org>

This bug essentially causes us to count memory-some in walltime and
memory-full in tasktime, which can be quite confusing and misleading
in combined CPU and memory pressure situations.
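
To make the single-reclaimer case concrete, here is a minimal userspace
sketch of the old and new FULL tests next to the SOME test. It is modelled
on the hunk quoted above, not copied from kernel/sched/psi.c: the enum
layout, the helper names (mem_some, mem_full_old, mem_full_new,
example_single_reclaimer) and the plain counter array are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the per-CPU task counts (assumed layout). */
enum { NR_RUNNING, NR_MEMSTALL, NR_MEMSTALL_RUNNING, NR_COUNTS };

/* SOME: at least one task is stalled on memory. */
static bool mem_some(const unsigned int tasks[NR_COUNTS])
{
	return tasks[NR_MEMSTALL] != 0;
}

/* Old FULL test: somebody is stalled and nobody is runnable. */
static bool mem_full_old(const unsigned int tasks[NR_COUNTS])
{
	return tasks[NR_MEMSTALL] && !tasks[NR_RUNNING];
}

/* New FULL test: somebody is stalled and every runnable task is in a memstall. */
static bool mem_full_new(const unsigned int tasks[NR_COUNTS])
{
	return tasks[NR_MEMSTALL] &&
	       tasks[NR_RUNNING] == tasks[NR_MEMSTALL_RUNNING];
}

/* One task, on the runqueue, doing reclaim: it is counted in all three. */
static void example_single_reclaimer(void)
{
	const unsigned int tasks[NR_COUNTS] = {
		[NR_RUNNING] = 1,
		[NR_MEMSTALL] = 1,
		[NR_MEMSTALL_RUNNING] = 1,
	};

	assert(mem_some(tasks));       /* SOME is reported */
	assert(!mem_full_old(tasks));  /* old test sees a "productive" task */
	assert(mem_full_new(tasks));   /* new test reports FULL as well */
}
```

With the old test, FULL only accrues while the reclaimer is off the
runqueue, whereas SOME accrues for the whole stall; with the new test the
two agree for this case.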

The fix looks good to me, thanks Brian.

The bug's been there since the initial psi commit, so I don't think a
stable backport is warranted.

Peter, absent objections, can you please pick this up through -tip?
