Message-ID: <CAB=+i9Qz6xpMzV6pOwbdbAC-DBwXvBmUi7Zvjjvi3Yrhf4xX0w@mail.gmail.com>
Date: Fri, 27 Sep 2024 17:01:37 +0900
From: Hyeonggon Yoo <42.hyeyoo@...il.com>
To: zhang fangzheng <fangzheng.zhang1003@...il.com>
Cc: Vlastimil Babka <vbabka@...e.cz>, Fangzheng Zhang <fangzheng.zhang@...soc.com>,
Christoph Lameter <cl@...ux.com>, Pekka Enberg <penberg@...nel.org>, David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>, Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <roman.gushchin@...ux.dev>, Greg KH <gregkh@...uxfoundation.org>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, tkjos@...gle.com,
Yuming Han <yuming.han@...soc.com>, Suren Baghdasaryan <surenb@...gle.com>,
Kent Overstreet <kent.overstreet@...ux.dev>
Subject: Re: [PATCH 0/2] Introduce panic function when slub leaks
On Fri, Sep 27, 2024 at 4:28 PM zhang fangzheng
<fangzheng.zhang1003@...il.com> wrote:
>
> On Thu, Sep 26, 2024 at 8:30 PM Vlastimil Babka <vbabka@...e.cz> wrote:
> >
> > On 9/25/24 15:18, Hyeonggon Yoo wrote:
> > > On Wed, Sep 25, 2024 at 12:23 PM Fangzheng Zhang
> > > <fangzheng.zhang@...soc.com> wrote:
> > >>
> > >> Hi all,
> > >
> > > Hi Fangzheng,
> > >
> > >> A method to detect slub leaks by monitoring its usage in real time
> > >> on the page allocation path of the slub. When the slub occupancy
> > >> exceeds the user-set value, it is considered that the slub is leaking
> > >> at this time
> > >
> > > I'm not sure why this should be a kernel feature. Why not write a user
> > > script that parses
> > > MemTotal: and Slab: part of /proc/meminfo file and generates a log
> > > entry or an alarm?
> >
> > Yes very much agreed. It seems rather arbitrary. Why slab, why not any other
> > kernel-specific counter in /proc/meminfo? Why include NR_SLAB_RECLAIMABLE_B
> > when that's used by caches with shrinkers?
>
> OK, this is because the current goal is specifically to
> track the memory usage of the slab module.
> In stability tests (e.g., monkey tests), when an ANR or reboot
> problem occurs, memory analysis very often shows that
> the slab occupancy is high.
> Besides monitoring leaks directly in the allocation path, it is
> also convenient to record the allocation stack information
> when an exception occurs.
[+Cc Memory Allocation Profiling maintainers]
For recording allocation information, I think CONFIG_MEM_ALLOC_PROFILING [1] [2]
could be used to track the allocation sites that contribute to memory leaks,
instead of making the kernel panic or print a WARNING.
Or, with higher overhead, slub_debug=U [3] could be used if this is not meant
to run in production.
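As a rough illustration of how CONFIG_MEM_ALLOC_PROFILING output could be consumed from userspace, here is a minimal sketch that ranks allocation sites by bytes. It assumes each data line of /proc/allocinfo has the form "size calls tag", as in the example output in [1]; the exact layout may differ between kernel versions, and the helper names parse_allocinfo and top_sites are made up for this example.

```python
def parse_allocinfo(text):
    """Parse /proc/allocinfo-style text into (bytes, calls, tag) tuples.

    Assumption: data lines start with two integers followed by the tag.
    Anything else (for example header lines) is skipped.
    """
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            entries.append((int(parts[0]), int(parts[1]), parts[2].strip()))
    return entries


def top_sites(entries, n=10):
    """Return the n entries holding the most bytes, largest first."""
    return sorted(entries, key=lambda e: e[0], reverse=True)[:n]
```

On a kernel with profiling enabled, something like top_sites(parse_allocinfo(open("/proc/allocinfo").read())) run periodically would show which call sites keep growing, without any panic.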
[1] https://docs.kernel.org/mm/allocation-profiling.html
[2] https://lwn.net/Articles/974380
[3] https://docs.kernel.org/mm/slub.html#debugfs-files-for-slub
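The userspace alternative suggested earlier in the thread (parsing the MemTotal: and Slab: lines of /proc/meminfo and raising an alarm) could look roughly like the sketch below. The 30% threshold and the function names are arbitrary choices for illustration, not values taken from the patch.

```python
import sys


def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric values (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_percent(fields):
    """Slab usage as a percentage of MemTotal."""
    return 100.0 * fields["Slab"] / fields["MemTotal"]


def check_slab(path="/proc/meminfo", threshold=30.0):
    """Log a warning and return True if slab usage exceeds threshold percent."""
    with open(path) as f:
        pct = slab_percent(parse_meminfo(f.read()))
    if pct > threshold:
        print(f"slab usage {pct:.1f}% exceeds {threshold:.1f}%", file=sys.stderr)
        return True
    return False
```

Calling check_slab() from a periodic job would give a configurable alarm, and the same parser works for any other counter in /proc/meminfo.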
Best,
Hyeonggon
> > A userspace solution should be straightforward and universal - easily
> > configurable for different scenarios.
> >
> > >> and a panic operation will be triggered immediately.
> > >
> > > I don't think it would be a good idea to panic unnecessarily.
> > > IMO it is not proper to panic when the kernel can still run.
> >
> > Yes these days it's practically impossible to add a BUG_ON() for more
> > serious conditions than this.
> >
> > Please don't post new versions addressing specific implementation details
> > until this fundamental issue is addressed.
> >
> > Thanks,
> > Vlastimil
> >
> > > Any thoughts?
> > >
> > > Thanks,
> > > Hyeonggon
> >