Message-ID: <20231109081412.161ce68f@kernel.org>
Date: Thu, 9 Nov 2023 08:14:12 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Ilias Apalodimas <ilias.apalodimas@...aro.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com, almasrymina@...gle.com, hawk@...nel.org
Subject: Re: [PATCH net-next 00/15] net: page_pool: add netlink-based
introspection
On Thu, 9 Nov 2023 10:11:47 +0200 Ilias Apalodimas wrote:
> > We immediately run into page pool leak warnings, both real and
> > false positive. As Eric pointed out / predicted, there's no
> > guarantee that applications will read / close their sockets, so a
> > page pool page may be stuck in a socket (but not leaked) forever.
> > This happens a lot in our fleet. Most of these are obviously due
> > to application bugs, but we should not be printing kernel warnings
> > over minor application resource leaks.
>
> Fair enough, I guess you mean 'continuous warnings'?
Yes, in this case, but I'm making a general statement.
Or do you mean there's a typo / grammar issue?
> > Conversely, page pool memory may get leaked at runtime, and we
> > have no way to detect / track that, unless someone reconfigures
> > the NIC and destroys the page pools which leaked the pages.
> >
> > The solution presented here is to expose the memory use of page
> > pools via netlink. This allows continuous monitoring of the memory
> > used by page pools, regardless of whether they have been destroyed
> > or not. The sample in patch 15 can print the memory use and
> > recycling efficiency:
> >
> > $ ./page-pool
> > eth0[2] page pools: 10 (zombies: 0)
> > refs: 41984 bytes: 171966464 (refs: 0 bytes: 0)
> > recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201)
>
> That's reasonable, and the recycling rate is pretty impressive.
This is just from a test machine: fresh boot, maybe a short iperf run,
I don't remember now :) In any case, not a real workload.
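
For reference, the recycling figure above is consistent with total
recycled pages divided by total allocated pages, assuming the paired
numbers after "alloc:" and "recycle:" are sub-counters that simply sum
to those totals (which sub-counter is which isn't spelled out here).
A minimal sketch of that arithmetic:

#include <stdio.h>

/* Re-derive the recycling percentage from the counters in the sample
 * output above.  Assumption: the paired numbers sum to total
 * allocations and total recycled pages respectively.
 */
int main(void)
{
	unsigned long alloc = 656 + 397681;	/* alloc: 656:397681 */
	unsigned long recycle = 89652 + 270201;	/* recycle: 89652:270201 */

	printf("recycling: %.1f%%\n", 100.0 * recycle / alloc);
	return 0;
}

With those numbers this prints "recycling: 90.3%", matching the sample
output.
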
> Any idea how that translated into improvements overall? mem/CPU pressure, etc.
I haven't collected much prod data at this stage; I'm hoping to add
this to our internal kernel and then do a more thorough investigation.