Message-ID: <20250916171040.12436-1-ajgja@amazon.com>
Date: Tue, 16 Sep 2025 17:10:40 +0000
From: Andrew Guerrero <ajgja@...zon.com>
To: <gregkh@...uxfoundation.org>
CC: <ajgja@...zon.com>, <akpm@...ux-foundation.org>,
<cgroups@...r.kernel.org>, <gunnarku@...zon.com>, <guro@...com>,
<hannes@...xchg.org>, <linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<mhocko@...nel.org>, <muchun.song@...ux.dev>, <roman.gushchin@...ux.dev>,
<shakeel.butt@...ux.dev>, <stable@...r.kernel.org>, <vdavydov.dev@...il.com>
Subject: Re: [PATCH] mm: memcontrol: fix memcg accounting during cpu hotplug

On 2025-09-12 12:45 UTC, Greg KH wrote:
> On Mon, Sep 08, 2025 at 09:09:00PM +0000, Andrew Guerrero wrote:
> > On 2025-09-07 13:10 UTC, Greg KH wrote:
> > > On Sat, Sep 06, 2025 at 03:21:08AM +0000, Andrew Guerrero wrote:
> > > > This patch is intended for the 5.10 longterm release branch. It will not apply
> > > > cleanly to mainline and is inadvertently fixed by a larger series of changes in
> > > > later release branches:
> > > > a3d4c05a4474 ("mm: memcontrol: fix cpuhotplug statistics flushing").
> > >
> > > Why can't we take those instead?
> > >
> > > > In 5.15, the counter flushing code is completely removed. This may be another
> > > > viable option here too, though it's a larger change.
> > >
> > > If it's not needed anymore, why not just remove it with the upstream
> > > commits as well?
> >
> > Yeah, my understanding is the typical flow is to pull commits from upstream into
> > stable branches. However, I'm not confident I know the answer to "which
> > upstream commits?" To get started,
> >
> > `git log -L :memcg_hotplug_cpu_dead:mm/memcontrol.c linux-5.10.y..linux-5.15.y`
> >
> > tells me that the upstream changes to pull are:
> >
> > - https://lore.kernel.org/all/20210209163304.77088-1-hannes@cmpxchg.org/T/#u
> > - https://lore.kernel.org/all/20210716212137.1391164-1-shakeelb@google.com/T/#u
> >
> > However, these are substantial features that "fix" the issue indirectly by
> > transitioning the memcg accounting system over to rstats. I can pick these 10
> > upstream commits, but I'm worried I may overlook some additional patches from
> > 5.15.y that need to go along with them. I may need some guidance if we go this
> > route.
>
> Testing is key :)
>
> > Another reasonable option is to take neither route. We can maintain this patch
> > internally and then drop it once we upgrade to a new kernel version.
>
> Perhaps just do that for now if you all are hitting this issue? It
> seems to be the only report I've seen so far.

We are hitting this issue only in a stress test, and I think we got lucky with
experiencing it, so I wouldn't be too surprised if this is the first and only
report.

Thanks for taking a look!

Andrew