Message-ID: <CAC_iWjKi0V6wUmutmpjYyjZGkwXef4bxQwcx6o5rytT+-Pj5Eg@mail.gmail.com>
Date: Thu, 9 Nov 2023 10:11:47 +0200
From: Ilias Apalodimas <ilias.apalodimas@...aro.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com, almasrymina@...gle.com, hawk@...nel.org
Subject: Re: [PATCH net-next 00/15] net: page_pool: add netlink-based introspection
Hi Jakub,
On Tue, 24 Oct 2023 at 19:02, Jakub Kicinski <kuba@...nel.org> wrote:
>
> This is a new revision of the RFC posted in August:
> https://lore.kernel.org/all/20230816234303.3786178-1-kuba@kernel.org/
> There's been a handful of fixes and tweaks but the overall
> architecture is unchanged.
>
> As a reminder the RFC was posted as the first step towards
> an API which could configure the page pools (GET API as a stepping
> stone for a SET API to come later). I wasn't sure whether we should
> commit to the GET API before the SET takes shape, hence the large
> delay between versions.
>
> Unfortunately, real deployment experience made this series much more
> urgent. We recently started to deploy newer kernels / drivers
> at Meta, making significant use of page pools for the first time.
That's nice and scary at the same time!
> We immediately ran into page pool leaks, both real ones and
> false-positive warnings. As Eric pointed out/predicted, there's no
> guarantee that applications will read / close their sockets, so a
> page pool page may be stuck in a socket (but not leaked) forever.
> This happens a lot in our fleet. Most of these are obviously due to
> application bugs, but we should not be printing kernel warnings
> because of minor application resource leaks.
Fair enough; I guess you mean 'continuous warnings'?
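To make that failure mode concrete, here's a contrived user-space
sketch: an application that fills its TCP receive queue and never
reads it. Over loopback no page_pool pages are actually pinned, but
on a real NIC the driver's RX pages would back this queue until the
socket is read or closed:

	/* Contrived sketch of the "stuck, not leaked" case: fill a
	 * TCP receive queue and never read it. */
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>
	#include <netinet/in.h>
	#include <arpa/inet.h>
	#include <sys/socket.h>

	int main(void)
	{
		struct sockaddr_in addr = {
			.sin_family = AF_INET,
			.sin_port = htons(9999),
			.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
		};
		int one = 1, srv, cli, peer;
		char buf[4096];

		srv = socket(AF_INET, SOCK_STREAM, 0);
		setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
		if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) ||
		    listen(srv, 1))
			return 1;

		cli = socket(AF_INET, SOCK_STREAM, 0);
		if (connect(cli, (struct sockaddr *)&addr, sizeof(addr)))
			return 1;
		peer = accept(srv, NULL, NULL);
		(void)peer;	/* never recv() on it -- that's the point */

		memset(buf, 0xab, sizeof(buf));
		while (send(cli, buf, sizeof(buf), MSG_DONTWAIT) > 0)
			;	/* fill the peer's receive queue */

		/* The queued, unread data pins its backing memory for
		 * as long as the application dawdles. */
		printf("receive queue full, sleeping with data unread\n");
		pause();
		return 0;
	}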
>
> Conversely, page pool memory may get leaked at runtime, and we have
> no way to detect / track that unless someone reconfigures the NIC
> and destroys the page pools which leaked the pages.
>
> The solution presented here is to expose the memory use of page
> pools via netlink. This allows for continuous monitoring of memory
> used by page pools, regardless of whether they have been destroyed.
> The sample in patch 15 can print the memory use and recycling
> efficiency:
>
> $ ./page-pool
> eth0[2] page pools: 10 (zombies: 0)
> refs: 41984 bytes: 171966464 (refs: 0 bytes: 0)
> recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201)
>
That's reasonable, and the recycling rate is pretty impressive. Any
idea how that translated into overall improvements (memory/CPU
pressure, etc.)?
Thanks
/Ilias
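FWIW, the 90.3% in the sample output above looks like plain recycle
events over alloc events: (89652 + 270201) / (656 + 397681) ~= 0.903.
A minimal sketch of that arithmetic; the field names here are
illustrative stand-ins, not the actual page_pool stats layout:

	/* Illustrative counters only, not the real uapi/stats struct.
	 * Recycling rate = pages recycled back into the pool over
	 * pages handed out. */
	struct pp_counters {
		unsigned long long alloc_slow;	   /* from the page allocator */
		unsigned long long alloc_fast;	   /* from the pool's caches */
		unsigned long long recycle_cached; /* to the lockless cache */
		unsigned long long recycle_ring;   /* to the ptr ring */
	};

	static double pp_recycling_pct(const struct pp_counters *c)
	{
		unsigned long long alloc = c->alloc_slow + c->alloc_fast;
		unsigned long long recycle = c->recycle_cached +
					     c->recycle_ring;

		/* (89652 + 270201) / (656 + 397681) => 90.3% as above */
		return alloc ? 100.0 * recycle / alloc : 0.0;
	}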
> The main change compared to the RFC is that the API now exposes
> outstanding references and byte counts even for "live" page pools.
> The warning is no longer printed if the page pool is accessible
> via netlink.
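
If I read the cover letter right, the live-pool numbers fall out of
the pool's hold/release page counters. A simplified model of that
derivation (counter names loosely follow net/core/page_pool.c, but
this is a sketch, not the kernel code):

	#include <stdint.h>

	/* Outstanding refs/bytes for a live pool: pages handed out
	 * minus pages returned, scaled by the pool's chunk size. */
	struct pp_state {
		uint32_t hold_cnt;	/* bumped when a page leaves the pool */
		uint32_t release_cnt;	/* bumped when a page is returned/freed */
		unsigned int order;	/* pool hands out 2^order page chunks */
	};

	static int32_t pp_inflight_refs(const struct pp_state *s)
	{
		/* unsigned subtraction, so u32 counter wraparound cancels */
		return (int32_t)(s->hold_cnt - s->release_cnt);
	}

	static uint64_t pp_inflight_bytes(const struct pp_state *s,
					  uint64_t page_size)
	{
		return (uint64_t)pp_inflight_refs(s) *
		       (page_size << s->order);
	}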
>
> Jakub Kicinski (15):
> net: page_pool: split the page_pool_params into fast and slow
> net: page_pool: avoid touching slow on the fastpath
> net: page_pool: factor out uninit
> net: page_pool: id the page pools
> net: page_pool: record pools per netdev
> net: page_pool: stash the NAPI ID for easier access
> eth: link netdev to page_pools in drivers
> net: page_pool: add nlspec for basic access to page pools
> net: page_pool: implement GET in the netlink API
> net: page_pool: add netlink notifications for state changes
> net: page_pool: report amount of memory held by page pools
> net: page_pool: report when page pool was destroyed
> net: page_pool: expose page pool stats via netlink
> net: page_pool: mute the periodic warning for visible page pools
> tools: ynl: add sample for getting page-pool information
>
> Documentation/netlink/specs/netdev.yaml | 161 +++++++
> Documentation/networking/page_pool.rst | 10 +-
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 1 +
> .../net/ethernet/mellanox/mlx5/core/en_main.c | 1 +
> drivers/net/ethernet/microsoft/mana/mana_en.c | 1 +
> include/linux/list.h | 20 +
> include/linux/netdevice.h | 4 +
> include/linux/poison.h | 2 +
> include/net/page_pool/helpers.h | 8 +-
> include/net/page_pool/types.h | 43 +-
> include/uapi/linux/netdev.h | 36 ++
> net/core/Makefile | 2 +-
> net/core/netdev-genl-gen.c | 52 +++
> net/core/netdev-genl-gen.h | 11 +
> net/core/page_pool.c | 78 ++--
> net/core/page_pool_priv.h | 12 +
> net/core/page_pool_user.c | 414 +++++++++++++++++
> tools/include/uapi/linux/netdev.h | 36 ++
> tools/net/ynl/generated/netdev-user.c | 419 ++++++++++++++++++
> tools/net/ynl/generated/netdev-user.h | 171 +++++++
> tools/net/ynl/lib/ynl.h | 2 +-
> tools/net/ynl/samples/.gitignore | 1 +
> tools/net/ynl/samples/Makefile | 2 +-
> tools/net/ynl/samples/page-pool.c | 147 ++++++
> 24 files changed, 1586 insertions(+), 48 deletions(-)
> create mode 100644 net/core/page_pool_priv.h
> create mode 100644 net/core/page_pool_user.c
> create mode 100644 tools/net/ynl/samples/page-pool.c
>
> --
> 2.41.0
>