Message-ID: <ZFKfG7bVuOAk27yP@moria.home.lan>
Date: Wed, 3 May 2023 13:51:23 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Tejun Heo <tj@...nel.org>
Cc: Michal Hocko <mhocko@...e.com>,
Suren Baghdasaryan <surenb@...gle.com>,
akpm@...ux-foundation.org, vbabka@...e.cz, hannes@...xchg.org,
roman.gushchin@...ux.dev, mgorman@...e.de, dave@...olabs.net,
willy@...radead.org, liam.howlett@...cle.com, corbet@....net,
void@...ifault.com, peterz@...radead.org, juri.lelli@...hat.com,
ldufour@...ux.ibm.com, catalin.marinas@....com, will@...nel.org,
arnd@...db.de, tglx@...utronix.de, mingo@...hat.com,
dave.hansen@...ux.intel.com, x86@...nel.org, peterx@...hat.com,
david@...hat.com, axboe@...nel.dk, mcgrof@...nel.org,
masahiroy@...nel.org, nathan@...nel.org, dennis@...nel.org,
muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org,
pasha.tatashin@...een.com, yosryahmed@...gle.com,
yuzhao@...gle.com, dhowells@...hat.com, hughd@...gle.com,
andreyknvl@...il.com, keescook@...omium.org,
ndesaulniers@...gle.com, gregkh@...uxfoundation.org,
ebiggers@...gle.com, ytcoode@...il.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
bristot@...hat.com, vschneid@...hat.com, cl@...ux.com,
penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com,
glider@...gle.com, elver@...gle.com, dvyukov@...gle.com,
shakeelb@...gle.com, songmuchun@...edance.com, jbaron@...mai.com,
rientjes@...gle.com, minchan@...gle.com, kaleshsingh@...gle.com,
kernel-team@...roid.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
linux-arch@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-modules@...r.kernel.org,
kasan-dev@...glegroups.com, cgroups@...r.kernel.org
Subject: Re: [PATCH 00/40] Memory allocation profiling

On Wed, May 03, 2023 at 06:35:49AM -1000, Tejun Heo wrote:
> Hello, Kent.
>
> On Wed, May 03, 2023 at 04:05:08AM -0400, Kent Overstreet wrote:
> > No, we're still waiting on the tracing people to _demonstrate_, not
> > claim, that this is at all possible in a comparable way with tracing.
>
> So, we (meta) happen to do stuff like this all the time in the fleet to hunt
> down tricky persistent problems like memory leaks, ref leaks, what-have-you.
> In recent kernels, with kprobe and BPF, our ability to debug these sorts of
> problems has improved a great deal. Below, I'm attaching a bcc script I used
> to hunt down, IIRC, a double vfree. It's not exactly for a leak but leaks
> can follow the same pattern.
>
> There are of course some pros and cons to this approach:
>
> Pros:
>
> * The framework doesn't really have any runtime overhead, so we can have it
> deployed in the entire fleet and debug wherever the problem is.
>
> * It's fully flexible and programmable which enables non-trivial filtering
> and summarizing to be done inside kernel w/ BPF as necessary, which is
> pretty handy for tracking high frequency events.
>
> * BPF is pretty performant. Dedicated built-in kernel code can do better of
> course but BPF's jit compiled code & its data structures are fast enough.
> I don't remember any time this was a problem.
>
> Cons:
>
> * BPF has some learning curve. Also the fact that what it provides is a wide
> open field rather than something scoped out for a specific problem can
> make it seem a bit daunting at the beginning.
>
> * Because tracking starts when the script starts running, it doesn't know
> anything which has happened up to that point, so you gotta pay attention to
> handling e.g. frees which don't match allocs. It's kinda annoying
> but not a huge problem usually. There are ways to build BPF progs into
> the kernel and load them early but I haven't experimented with it yet
> personally.
>
> I'm not necessarily against adding dedicated memory debugging mechanism but
> do wonder whether the extra benefits would be enough to justify the code and
> maintenance overhead.
>
> Oh, a bit of delta but for anyone who's more interested in debugging
> problems like this: I tend to go for bcc
> (https://github.com/iovisor/bcc) for this sort of problem. Others prefer to
> write against libbpf directly or use bpftrace
> (https://github.com/iovisor/bpftrace).

Do you have example output?

TBH I'm skeptical that it's even possible to do full memory allocation
profiling with tracing/bpf, due to recursive memory allocations and
needing an index of outstanding allocations.
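
To make "an index of outstanding allocations" concrete, here is a minimal
sketch in plain Python (not BPF) of the bookkeeping a tracing-based profiler
would seem to need. Everything in it is hypothetical and for illustration
only: the class name, the event shapes, and the use of a string as a
callsite. It assumes alloc and free events arrive as (address, size,
callsite) and (address) respectively. Note that the index grows by one entry
per live allocation, which is roughly where the concern about recursive
allocations comes from.

```python
from collections import defaultdict


class OutstandingIndex:
    """Hypothetical index of live allocations, keyed by address."""

    def __init__(self):
        self.live = {}                         # address -> (callsite, size)
        self.bytes_by_site = defaultdict(int)  # callsite -> live bytes
        self.unmatched_frees = 0               # frees with no recorded alloc

    def _uncharge(self, callsite, size):
        # Remove a previously charged allocation from its callsite total.
        self.bytes_by_site[callsite] -= size

    def on_alloc(self, addr, size, callsite):
        # If the address is already present we must have missed its free;
        # drop the stale entry so the totals do not double count.
        stale = self.live.pop(addr, None)
        if stale is not None:
            self._uncharge(*stale)
        self.live[addr] = (callsite, size)
        self.bytes_by_site[callsite] += size

    def on_free(self, addr):
        entry = self.live.pop(addr, None)
        if entry is None:
            # Allocated before tracing started (or the event was lost):
            # there is nothing to uncharge, so just count it.
            self.unmatched_frees += 1
            return
        self._uncharge(*entry)


def replay(events):
    """Feed ('alloc', addr, size, site) / ('free', addr) tuples to an index."""
    idx = OutstandingIndex()
    for ev in events:
        if ev[0] == "alloc":
            idx.on_alloc(ev[1], ev[2], ev[3])
        else:
            idx.on_free(ev[1])
    return idx
```

Replaying a trace through replay() leaves per-callsite live byte counts in
bytes_by_site, and unmatched_frees shows how much of the picture is missing
because tracking started late.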