linux-kernel - Re: [PATCH 1/2] mm: memcontrol: flush percpu vmstats before releasing memcg

Message-ID: <20190813214643.GA20632@tower.DHCP.thefacebook.com>
Date:   Tue, 13 Aug 2019 21:46:47 +0000
From:   Roman Gushchin <guro@...com>
To:     Andrew Morton <akpm@...ux-foundation.org>
CC:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Michal Hocko <mhocko@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH 1/2] mm: memcontrol: flush percpu vmstats before releasing
 memcg

On Tue, Aug 13, 2019 at 02:27:52PM -0700, Andrew Morton wrote:
> On Mon, 12 Aug 2019 15:29:10 -0700 Roman Gushchin <guro@...com> wrote:
> 
> > Percpu caching of local vmstats with the conditional propagation
> > by the cgroup tree leads to an accumulation of errors on non-leaf
> > levels.
> > 
> > Let's imagine two nested memory cgroups A and A/B. Say, a process
> > belonging to A/B allocates 100 pagecache pages on CPU 0.
> > The percpu cache will spill 3 times, so that 32*3=96 pages will be
> > accounted to A/B and A atomic vmstat counters, 4 pages will remain
> > in the percpu cache.
> > 
> > Imagine A/B is close to memory.max, so that every following allocation
> > triggers a direct reclaim on the local CPU. Say, each such attempt
> > frees 16 pages on a new CPU. That means every percpu cache will
> > have -16 pages, except the first one, which will have 4 - 16 = -12.
> > The A/B and A atomic counters will not be touched at all.
> > 
> > Now a user removes A/B. All percpu caches are freed and corresponding
> > vmstat numbers are forgotten. A has 96 pages more than expected.
> > 
> > As memory cgroups are created and destroyed, errors do accumulate.
> > Even differences of 1-2 pages can accumulate into large numbers.
> > 
> > To fix this issue let's accumulate and propagate percpu vmstat
> > values before releasing the memory cgroup. At this point these
> > numbers are stable and cannot be changed.
> > 
> > Since on cpu hotplug we do flush percpu vmstats anyway, we can
> > iterate only over online cpus.
> > 
> > Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
> 
> Is this not serious enough for a cc:stable?

I hope the "Fixes" tag will work, but yeah, my bad, cc:stable is definitely
a good idea here.

Added stable@ to cc.

Thanks!
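
For readers following the arithmetic in the quoted changelog, here is a
rough user-space model of it. It is only a sketch and not the kernel code:
the Cgroup class, mod_state(), flush_percpu() and run() are hypothetical
names, the machine is assumed to have 4 CPUs, and the spill threshold
(abs(x) >= 32) is an assumption chosen so that the numbers match the
32*3=96 example above; the real threshold check in memcontrol.c may differ.

```python
BATCH = 32   # assumed spill threshold, picked to match the 32*3=96 example
NCPUS = 4    # assumed number of online CPUs


class Cgroup:
    """Toy model of one memcg with an atomic counter and percpu caches."""

    def __init__(self, parent=None):
        self.parent = parent
        self.atomic = 0              # hierarchical (atomic) vmstat counter
        self.percpu = [0] * NCPUS    # per-CPU cached deltas

    def _propagate(self, x):
        # Add x to this cgroup's atomic counter and to all of its ancestors.
        node = self
        while node is not None:
            node.atomic += x
            node = node.parent

    def mod_state(self, cpu, delta):
        # Apply the delta one page at a time through the percpu cache,
        # spilling to the atomic counters once the cache reaches BATCH.
        step = 1 if delta > 0 else -1
        for _ in range(abs(delta)):
            x = self.percpu[cpu] + step
            if abs(x) >= BATCH:
                self._propagate(x)
                x = 0
            self.percpu[cpu] = x

    def flush_percpu(self):
        # The idea of the fix: push the cached remainders up the tree
        # before the cgroup goes away.
        for cpu in range(NCPUS):
            if self.percpu[cpu]:
                self._propagate(self.percpu[cpu])
            self.percpu[cpu] = 0


def run(flush_before_release):
    a = Cgroup()
    b = Cgroup(parent=a)
    b.mod_state(cpu=0, delta=100)        # 96 pages spilled, 4 stay cached
    for cpu in range(NCPUS):
        b.mod_state(cpu=cpu, delta=-16)  # reclaim stays in the percpu caches
    if flush_before_release:
        b.flush_percpu()
    # Releasing A/B: its percpu caches are simply dropped here.
    return a.atomic


if __name__ == "__main__":
    print("A without flush:", run(False))
    print("A with flush:   ", run(True))
```

In this model A keeps 96 pages from the released child when nothing is
flushed, while flushing first leaves 100 - 4*16 = 36, which is the real
net number of pages charged at that point.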