Message-ID: <CAJuCfpHPrVm6WPRRDZTKr+3XrLZnd-BkibzvFUAOBRxE5k=47w@mail.gmail.com>
Date: Thu, 12 Sep 2024 09:12:20 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: David Wang <00107082@....com>, Yu Zhao <yuzhao@...gle.com>
Cc: kent.overstreet@...ux.dev, akpm@...ux-foundation.org, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] Add accumulated call counter for memory allocation profiling

On Wed, Sep 11, 2024 at 7:28 PM David Wang <00107082@....com> wrote:
>
> At 2024-07-02 05:58:50, "Kent Overstreet" <kent.overstreet@...ux.dev> wrote:
> >On Mon, Jul 01, 2024 at 10:23:32AM GMT, David Wang wrote:
> >> HI Suren,
> >>
> >> At 2024-07-01 03:33:14, "Suren Baghdasaryan" <surenb@...gle.com> wrote:
> >> >On Mon, Jun 17, 2024 at 8:33 AM David Wang <00107082@....com> wrote:
> >> >>
> >> >> An accumulated call counter can be used to evaluate the rate
> >> >> of memory allocation via delta(counters)/delta(time).
> >> >> This metric can help analyze performance behaviour,
> >> >> e.g. when tuning cache sizes.
> >> >
> >> >Sorry for the delay, David.
> >> >IIUC with this counter you can identify the number of allocations ever
> >> >made from a specific code location. Could you please clarify the usage
> >> >a bit more? Is the goal to see which locations are the most active and
> >> >the rate at which allocations are made there? How will that
> >> >information be used?
> >>
> >> Cumulative counters can be sampled with timestamps: say at T1 a monitoring tool gets a sample value V1,
> >> and after one sampling interval, at T2, it gets a sample value V2. The average allocation rate can then be
> >> evaluated as (V2-V1)/(T2-T1). (The accuracy depends on the sampling interval.)
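> >>
> >> For illustration, a rough userspace sketch of this sampling scheme (the
> >> read_total_calls() helper below is just a stand-in that sums the existing
> >> "calls" column of /proc/allocinfo; it is not the proposed accumulated counter):
> >>
> >>     #include <stdio.h>
> >>     #include <time.h>
> >>     #include <unistd.h>
> >>
> >>     /* Sum the "calls" column of /proc/allocinfo, assuming the
> >>      * "<bytes> <calls> <location>" line format; header lines that
> >>      * do not parse are skipped. */
> >>     static unsigned long long read_total_calls(void)
> >>     {
> >>             unsigned long long bytes, calls, total = 0;
> >>             char line[512];
> >>             FILE *f = fopen("/proc/allocinfo", "r");
> >>
> >>             if (!f)
> >>                     return 0;
> >>             while (fgets(line, sizeof(line), f))
> >>                     if (sscanf(line, "%llu %llu", &bytes, &calls) == 2)
> >>                             total += calls;
> >>             fclose(f);
> >>             return total;
> >>     }
> >>
> >>     int main(void)
> >>     {
> >>             struct timespec t1, t2;
> >>             unsigned long long v1, v2;
> >>             double dt;
> >>
> >>             /* Sample V1 at T1, sleep one interval, sample V2 at T2. */
> >>             clock_gettime(CLOCK_MONOTONIC, &t1);
> >>             v1 = read_total_calls();
> >>             sleep(10);
> >>             clock_gettime(CLOCK_MONOTONIC, &t2);
> >>             v2 = read_total_calls();
> >>
> >>             dt = (t2.tv_sec - t1.tv_sec) + (t2.tv_nsec - t1.tv_nsec) / 1e9;
> >>             printf("average allocation rate: %.1f calls/s\n", (v2 - v1) / dt);
> >>             return 0;
> >>     }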
> >>
> >> This information "may" help identify where memory allocation is unnecessarily frequent,
> >> so some performance could be gained by making fewer allocations.
> >> The performance "gain" is just a guess; I do not have a valid example.
> >
> >Easier to just run perf...
>
> Hi,
>
> To Kent:
> It is oddly coincidental to be replying to this while I am debugging a performance issue in bcachefs :)
>
> Yes, it is true that performance bottlenecks can be identified with perf, but normally perf
> is not running continuously (though there are some continuous-profiling projects out there).
> Also, memory allocation is normally not the biggest bottleneck,
> so its impact may not be easily picked up by perf.
>
> Well, in the case of https://lore.kernel.org/lkml/20240906154354.61915-1-00107082@163.com/,
> the memory allocation was picked up by perf, though.
> But with this patch it is easier to spot that the memory allocation behavior is quite different:
> when performance was bad, the average rate for
> "fs/bcachefs/io_write.c:113 func:__bio_alloc_page_pool" was 400k/s,
> while when performance was good, the rate was less than 200/s.
>
> (I have a sampling tool that collects /proc/allocinfo; the data is stored in Prometheus,
> and the rate is calculated and plotted via the Prometheus query:
> irate(mem_profiling_count_total{file=~"fs/bcachefs.*", func="__bio_alloc_page_pool"}[5m]))
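>
> For reference, a rough sketch of what such a collector could emit per scrape,
> assuming the "<bytes> <calls> <file:line> func:<name>" line format of
> /proc/allocinfo and using the existing "calls" column as a stand-in for the
> accumulated counter (the metric name matches the query above):
>
>     #include <stdio.h>
>
>     /* Dump /proc/allocinfo in Prometheus text exposition format. */
>     int main(void)
>     {
>             unsigned long long bytes, calls;
>             char loc[256], func[128], line[512];
>             FILE *f = fopen("/proc/allocinfo", "r");
>
>             if (!f) {
>                     perror("/proc/allocinfo");
>                     return 1;
>             }
>             printf("# TYPE mem_profiling_count_total counter\n");
>             while (fgets(line, sizeof(line), f))
>                     if (sscanf(line, "%llu %llu %255s func:%127s",
>                                &bytes, &calls, loc, func) == 4)
>                             printf("mem_profiling_count_total{file=\"%s\",func=\"%s\"} %llu\n",
>                                    loc, func, calls);
>             fclose(f);
>             return 0;
>     }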
>
> I hope this is a valid example demonstrating the usefulness of accumulated
> memory-allocation counters for investigating performance issues.

Hi David,
I agree with Kent that this feature should be behind a Kconfig flag.
We don't want to impose the overhead on users who do not need this
feature.
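
Roughly something like the following (the symbol name here is only a
placeholder, not a suggestion for the actual option):

    config MEM_ALLOC_PROFILING_ACCUM_COUNTERS
            bool "Track accumulated allocation call counts"
            depends on MEM_ALLOC_PROFILING
            default n
            help
              Placeholder sketch only: keep the extra per-tag counter (and
              its update cost on every allocation) out of builds that do
              not need it.
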
Thanks,
Suren.

>
>
> Thanks
> David
>
>
>
