Message-ID: <2f43bdf7-5ce0-4835-9e60-39d91f637152@huaweicloud.com>
Date: Wed, 12 Nov 2025 08:56:28 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Leon Huang Fu <leon.huangfu@...pee.com>
Cc: akpm@...ux-foundation.org, cgroups@...r.kernel.org, corbet@....net,
 hannes@...xchg.org, jack@...e.cz, joel.granados@...nel.org,
 kyle.meyer@....com, lance.yang@...ux.dev, laoar.shao@...il.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 mclapinski@...gle.com, mhocko@...nel.org, mkoutny@...e.com,
 muchun.song@...ux.dev, roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
 tj@...nel.org
Subject: Re: [PATCH mm-new v3] mm/memcontrol: Add memory.stat_refresh for
 on-demand stats flushing



On 2025/11/11 14:44, Leon Huang Fu wrote:
> On Tue, Nov 11, 2025 at 9:00 AM Chen Ridong <chenridong@...weicloud.com> wrote:
>>
>>
>>
>> On 2025/11/10 21:50, Michal Koutný wrote:
>>> Hello Leon.
> 
> Hi Ridong,
> 
>>>
>>> On Mon, Nov 10, 2025 at 06:19:48PM +0800, Leon Huang Fu <leon.huangfu@...pee.com> wrote:
>>>> Memory cgroup statistics are updated asynchronously with periodic
>>>> flushing to reduce overhead. The current implementation uses a flush
>>>> threshold calculated as MEMCG_CHARGE_BATCH * num_online_cpus() for
>>>> determining when to aggregate per-CPU memory cgroup statistics. On
>>>> systems with high core counts, this threshold can become very large
>>>> (e.g., 64 * 256 = 16,384 on a 256-core system), leading to stale
>>>> statistics when userspace reads memory.stat files.
>>>>
>>
>> We have encountered this problem multiple times when running LTP tests. It can easily occur when
>> using a 64K page size.
>>
>> error:
>>         memcg_stat_rss 10 TFAIL: rss is 0, 266240 expected
>>
> 
> Have you encountered this problem in the real world?
> 
Do you mean whether we’ve encountered this issue in our product? We haven’t so far.

However, the LTP test fails quite easily because of it; the error log above comes directly from LTP.
The flush threshold isn't reached, so the per-CPU updates have not yet been aggregated and the
reported RSS is 0. We tried increasing the memory allocated by the LTP case, but that wasn't the
right solution.
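
To make the arithmetic concrete, here is a rough sketch of the threshold calculation from the patch
description, applied to the 266240 bytes from the LTP error. It is a simplified model, not the
kernel code: it assumes one pending update per charged page and ignores the periodic flush. The
helper names (flush_threshold, pages_needed, is_stale) are made up for illustration.

```python
import math

MEMCG_CHARGE_BATCH = 64  # value used in the patch description's example (64 * 256)


def flush_threshold(online_cpus: int) -> int:
    """Threshold from the patch description: MEMCG_CHARGE_BATCH * num_online_cpus()."""
    return MEMCG_CHARGE_BATCH * online_cpus


def pages_needed(nbytes: int, page_size: int) -> int:
    """Pages spanned by nbytes (assumption: each page counts as one stat update)."""
    return math.ceil(nbytes / page_size)


def is_stale(pending_updates: int, online_cpus: int) -> bool:
    """True while pending updates are still below the flush threshold (simplified)."""
    return pending_updates < flush_threshold(online_cpus)


if __name__ == "__main__":
    expected_rss = 266240  # bytes, taken from the LTP error message
    for page_size in (4096, 65536):
        pages = pages_needed(expected_rss, page_size)
        print(page_size, pages, flush_threshold(256), is_stale(pages, 256))
```

Under these assumptions the same allocation produces far fewer updates with 64K pages than with 4K
pages, which fits with the failure being easy to hit on 64K-page systems.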

>>>> This is particularly problematic for monitoring and management tools
>>>> that rely on reasonably fresh statistics, as they may observe data
>>>> that is thousands of updates out of date.
>>>>
>>>> Introduce a new write-only file, memory.stat_refresh, that allows
>>>> userspace to explicitly trigger an immediate flush of memory statistics.
>>>
> [...]
>>>
>>> Next, v1 and v2 haven't been consistent since introduction of v2 (unlike
>>> some other controllers that share code or even cftypes between v1 and
>>> v2). So I'd avoid introducing a new file to V1 API.
>>>
>>
>> We encountered this problem in v1. I think this is a common problem that should be fixed.
> 
> Thanks for pointing that out.
> 
> Thanks,
> Leon
> 
> [...]
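
For reference, a monitoring tool might use the proposed file roughly as sketched below. This is
only an illustration: memory.stat_refresh comes from the patch under review and is not an
established interface, the value written ("1") is an assumption because the quoted description
does not say what should be written, and refresh_and_read_stat is a hypothetical helper.

```python
from pathlib import Path


def refresh_and_read_stat(cgroup_dir: str) -> dict[str, int]:
    """Request a stats flush if the proposed file exists, then parse memory.stat."""
    cg = Path(cgroup_dir)
    refresh = cg / "memory.stat_refresh"  # proposed write-only file from the patch
    if refresh.exists():
        # Assumption: any write triggers the flush; "1" is a placeholder value.
        refresh.write_text("1\n")

    stats: dict[str, int] = {}
    for line in (cg / "memory.stat").read_text().splitlines():
        key, _, value = line.partition(" ")
        value = value.strip()
        if value.isdigit():  # skip anything that is not a plain "key value" pair
            stats[key] = int(value)
    return stats
```

On a kernel without the patch the refresh step is skipped and the function simply returns whatever
memory.stat currently reports.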

-- 
Best regards,
Ridong

