Message-ID: <CA+kNDJKDFvA6bamTZ8tHTR+NRaV7NbK8sEQREyhwEOsTnroJjw@mail.gmail.com>
Date: Fri, 27 Sep 2024 15:28:20 +0800
From: zhang fangzheng <fangzheng.zhang1003@...il.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Hyeonggon Yoo <42.hyeyoo@...il.com>, Fangzheng Zhang <fangzheng.zhang@...soc.com>,
Christoph Lameter <cl@...ux.com>, Pekka Enberg <penberg@...nel.org>, David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>, Andrew Morton <akpm@...ux-foundation.org>,
Roman Gushchin <roman.gushchin@...ux.dev>, Greg KH <gregkh@...uxfoundation.org>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, tkjos@...gle.com,
Yuming Han <yuming.han@...soc.com>
Subject: Re: [PATCH 0/2] Introduce panic function when slub leaks
On Thu, Sep 26, 2024 at 8:30 PM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 9/25/24 15:18, Hyeonggon Yoo wrote:
> > On Wed, Sep 25, 2024 at 12:23 PM Fangzheng Zhang
> > <fangzheng.zhang@...soc.com> wrote:
> >>
> >> Hi all,
> >
> > Hi Fangzheng,
> >
> >> A method to detect slub leaks by monitoring its usage in real time
> >> on the page allocation path of the slub. When the slub occupancy
> >> exceeds the user-set value, it is considered that the slub is leaking
> >> at this time
> >
> > I'm not sure why this should be a kernel feature. Why not write a user
> > script that parses
> > MemTotal: and Slab: part of /proc/meminfo file and generates a log
> > entry or an alarm?
>
> Yes very much agreed. It seems rather arbitrary. Why slab, why not any other
> kernel-specific counter in /proc/meminfo? Why include NR_SLAB_RECLAIMABLE_B
> when that's used by caches with shrinkers?
OK. The patch targets slab specifically because our current goal is to
track the memory usage of the slab allocator itself.
In our stability tests (i.e. monkey tests), when an ANR or an unexpected
reboot occurs, memory analysis very often shows that slab occupancy
is high.
Besides detecting the leak directly, monitoring in the allocation path
also makes it convenient to record the allocation stack information
at the moment the anomaly occurs.
> A userspace solution should be straightforward and universal - easily
> configurable for different scenarios.
>
> >> and a panic operation will be triggered immediately.
> >
> > I don't think it would be a good idea to panic unnecessarily.
> > IMO it is not proper to panic when the kernel can still run.
>
> Yes these days it's practically impossible to add a BUG_ON() for more
> serious conditions than this.
>
> Please don't post new versions addressing specific implementation details
> until this fundamental issue is addressed.
>
> Thanks,
> Vlastimil
>
> > Any thoughts?
> >
> > Thanks,
> > Hyeonggon
>
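
For reference, the userspace check suggested in the quoted text above
(parse the MemTotal: and Slab: lines of /proc/meminfo and raise a log
entry or an alarm) could look roughly like the sketch below. It is only
an illustration under stated assumptions: the function names and the
30% threshold are hypothetical and do not come from the patch or from
this thread.

```python
#!/usr/bin/env python3
"""Sketch: report from userspace when Slab exceeds a share of MemTotal."""


def parse_meminfo(text):
    """Return {field name: numeric value} from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   <digits> [kB]" lines; ignore anything else.
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def slab_percent(info):
    """Slab as a percentage of MemTotal (both are reported in kB)."""
    return 100.0 * info["Slab"] / info["MemTotal"]


def check(text, threshold_percent):
    """Return (exceeded, percent) for the given meminfo text."""
    pct = slab_percent(parse_meminfo(text))
    return pct > threshold_percent, pct


def main(path="/proc/meminfo", threshold_percent=30.0):
    """Read the file once and print a warning if the threshold is exceeded.

    The 30% default is an arbitrary illustrative value.
    """
    with open(path) as f:
        exceeded, pct = check(f.read(), threshold_percent)
    if exceeded:
        print(f"slab usage {pct:.1f}% exceeds {threshold_percent:.1f}% of MemTotal")
    return exceeded
```

Calling main() periodically (from a cron job or a test harness, for
example) gives the log entry or alarm without any kernel change. Note
that this only observes the counters; it does not capture the
allocation stack at the point of allocation.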