Message-ID: <87frbimoyd.fsf@linux.dev>
Date: Thu, 16 Oct 2025 16:00:58 -0700
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: JP Kobryn <inwardvessel@...il.com>
Cc: Shakeel Butt <shakeel.butt@...ux.dev>, andrii@...nel.org,
ast@...nel.org, mkoutny@...e.com, yosryahmed@...gle.com,
hannes@...xchg.org, tj@...nel.org, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-mm@...ck.org, bpf@...r.kernel.org, kernel-team@...a.com,
mhocko@...nel.org, muchun.song@...ux.dev
Subject: Re: [PATCH v2 0/2] memcg: reading memcg stats more efficiently

JP Kobryn <inwardvessel@...il.com> writes:
> On 10/15/25 6:10 PM, Roman Gushchin wrote:
>> JP Kobryn <inwardvessel@...il.com> writes:
>>
>>> On 10/15/25 1:46 PM, Shakeel Butt wrote:
>>>> Cc memcg maintainers.
>>>> On Wed, Oct 15, 2025 at 12:08:11PM -0700, JP Kobryn wrote:
>>>>> When reading cgroup memory.stat files there is significant kernel overhead
>>>>> in the formatting and encoding of numeric data into a string buffer. Beyond
>>>>> that, the given user mode program must decode this data and possibly
>>>>> perform filtering to obtain the desired stats. This process can be
>>>>> expensive for programs that periodically sample this data over a large
>>>>> enough fleet.
>>>>>
>>>>> As an alternative to reading memory.stat, introduce new kfuncs that allow
>>>>> fetching specific memcg stats from within cgroup iterator based bpf
>>>>> programs. This approach allows for numeric values to be transferred
>>>>> directly from the kernel to user mode via the mapped memory of the bpf
>>>>> program's elf data section. Reading stats this way effectively eliminates
>>>>> the numeric conversion work needed to be performed in both kernel and user
>>>>> mode. It also eliminates the need for filtering in a user mode program.
>>>>> i.e. where reading memory.stat returns all stats, this new approach allows
>>>>> returning only select stats.
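
For illustration, the user-mode cost described in the cover letter above can be sketched as follows. This is a minimal sketch, not code from the series: the helper name `memstat_lookup` is hypothetical, and it assumes the usual memory.stat layout of one "name value" pair per line. Every sample has to scan the text, match the key, and convert decimal digits back into an integer, which is the work the proposed kfuncs are meant to avoid.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: look up one "name value" line in memory.stat-style
 * text. Returns 0 and stores the value in *out on success, or -1 if the
 * key is absent or its value cannot be parsed.
 */
static int memstat_lookup(const char *buf, const char *key, uint64_t *out)
{
	size_t klen = strlen(key);
	const char *line = buf;

	while (*line) {
		/* A match needs the full key followed by a single space. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ' ') {
			/* Decode the decimal string back into a number. */
			if (sscanf(line + klen + 1, "%" SCNu64, out) == 1)
				return 0;
			return -1;
		}
		/* No match: skip to the start of the next line. */
		const char *nl = strchr(line, '\n');
		if (!nl)
			break;
		line = nl + 1;
	}
	return -1;
}
```

A caller would read the whole memory.stat file into a buffer and invoke the helper once per stat of interest, so both the filtering and the string-to-integer conversion are repeated on every sampling interval.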
>> It seems I've already implemented most of these functions as part of
>> bpfoom: https://lkml.org/lkml/2025/8/18/1403
>> So I definitely find them useful. It would be nice to merge our
>> efforts.
>
> Sounds great. I see in your series that you allow the kfuncs to accept
> integers as item numbers. Would my approach of using typed enums work
> for you? I wanted to take advantage of libbpf CO-RE so that the bpf
> program could gracefully handle cases where a given enumerator is not
> present in a given kernel version. I made use of this in the
> selftests.

Good point. I'm going to change it in the next version, which I'm about
to send out, either tomorrow or early next week.

> I'm planning on sending out a v3 so let me know if you would like to see
> any alterations that would align with bpfoom.

I kinda prefer my version, both in taking a memcg argument instead of a
cgroup and in naming. I also think it's safer to expose the
rate-limited version of the stats flushing function. But I do lack the
node-level statistics (which I don't need).

If it's OK with you, maybe you can rebase your patches on top of my v2
and I can include your patches in the series?

Thanks!