lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <advwinpel3emiq3otlxet2q7k5qwl43urgewhicvqhqliyqpcg@vztzhkqjig6n>
Date: Mon, 9 Jun 2025 17:45:05 -0700
From: Shakeel Butt <shakeel.butt@...ux.dev>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Vlastimil Babka <vbabka@...e.cz>, 
	"Ritesh Harjani (IBM)" <ritesh.list@...il.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>, 
	Michal Hocko <mhocko@...e.com>, david@...hat.com, lorenzo.stoakes@...cle.com, 
	Liam.Howlett@...cle.com, rppt@...nel.org, surenb@...gle.com, donettom@...ux.ibm.com, 
	aboorvad@...ux.ibm.com, sj@...nel.org, linux-mm@...ck.org, linux-fsdevel@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for
 users

On Mon, Jun 09, 2025 at 05:17:58PM -0700, Andrew Morton wrote:
> On Mon, 9 Jun 2025 10:56:46 +0200 Vlastimil Babka <vbabka@...e.cz> wrote:
> 
> > On 6/9/25 10:52 AM, Vlastimil Babka wrote:
> > > On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
> > >> Baolin Wang <baolin.wang@...ux.alibaba.com> writes:
> > >>
> > >>> On 2025/6/9 15:35, Michal Hocko wrote:
> > >>>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
> > >>>>>
> > >>>>> Any reason why we dropped the Fixes tag? I see there were a series of
> > >>>>> discussion on v1 and it got concluded that the fix was correct, then why
> > >>>>> drop the fixes tag?
> > >>>>
> > >>>> This seems more like an improvement than a bug fix.
> > >>>
> > >>> Yes. I don't have a strong opinion on this, but we (Alibaba) will 
> > >>> backport it manually,
> > >>>
> > >>> because some of user-space monitoring tools depend 
> > >>> on these statistics.
> > >>
> > >> That sounds like a regression then, isn't it?
> > > 
> > > Hm if counters were accurate before f1a7941243c1 and not afterwards, and
> > > this is making them accurate again, and some userspace depends on it,
> > > then Fixes: and stable is probably warranted then. If this was just a
> > > perf improvement, then not. But AFAIU f1a7941243c1 was the perf
> > > improvement...
> > 
> > Dang, should have re-read the commit log of f1a7941243c1 first. It seems
> > like the error margin due to batching existed also before f1a7941243c1.
> > 
> > " This patch converts the rss_stats into percpu_counter to convert the
> > error  margin from (nr_threads * 64) to approximately (nr_cpus ^ 2)."
> > 
> > so if on some systems this means worse margin than before, the above
> > "if" chain of thought might still hold.
> 
> f1a7941243c1 seems like a good enough place to tell -stable
> maintainers where to insert the patch (why does this sound rude).
> 
> The patch is simple enough.  I'll add fixes:f1a7941243c1 and cc:stable
> and, as the problem has been there for years, I'll leave the patch in
> mm-unstable so it will eventually get into LTS, in a well tested state.

One thing f1a7941243c1 noted was that the percpu counter conversion
enabled us to get more accurate stats with some cpu cost and in this
patch Baolin has shown that the cpu cost of accurate stats is
reasonable, so seems safe for stable backport. Also it seems like
multiple users are impacted by this issue, so I am fine with stable
backport.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ