Message-ID: <CAC_iWjKi0V6wUmutmpjYyjZGkwXef4bxQwcx6o5rytT+-Pj5Eg@mail.gmail.com>
Date: Thu, 9 Nov 2023 10:11:47 +0200
From: Ilias Apalodimas <ilias.apalodimas@...aro.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com, 
	pabeni@...hat.com, almasrymina@...gle.com, hawk@...nel.org
Subject: Re: [PATCH net-next 00/15] net: page_pool: add netlink-based introspection

Hi Jakub,

On Tue, 24 Oct 2023 at 19:02, Jakub Kicinski <kuba@...nel.org> wrote:
>
> This is a new revision of the RFC posted in August:
> https://lore.kernel.org/all/20230816234303.3786178-1-kuba@kernel.org/
> There's been a handful of fixes and tweaks but the overall
> architecture is unchanged.
>
> As a reminder the RFC was posted as the first step towards
> an API which could configure the page pools (GET API as a stepping
> stone for a SET API to come later). I wasn't sure whether we should
> commit to the GET API before the SET takes shape, hence the large
> delay between versions.
>
> Unfortunately, real deployment experience made this series much more
> urgent. We recently started to deploy newer kernels / drivers
> at Meta, making significant use of page pools for the first time.

That's nice and scary at the same time!

> We immediately ran into page pool leaks, both real and false-positive
> warnings. As Eric pointed out/predicted, there's no guarantee that
> applications will read / close their sockets, so a page pool page
> may be stuck in a socket (but not leaked) forever. This happens
> a lot in our fleet. Most of these are obviously due to application
> bugs, but we should not be printing kernel warnings due to minor
> application resource leaks.

Fair enough, I guess you mean 'continuous warnings'?

>
> Conversely the page pool memory may get leaked at runtime, and
> we have no way to detect / track that, unless someone reconfigures
> the NIC and destroys the page pools which leaked the pages.
>
> The solution presented here is to expose the memory use of page
> pools via netlink. This allows for continuous monitoring of memory
> used by page pools, regardless of whether they have been destroyed.
> The sample in patch 15 can print the memory use and recycling
> efficiency:
>
> $ ./page-pool
>     eth0[2]     page pools: 10 (zombies: 0)
>                 refs: 41984 bytes: 171966464 (refs: 0 bytes: 0)
>                 recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201)
>

That's reasonable, and the recycling rate is pretty impressive.  Any
idea how that translated into overall improvements? Memory/CPU pressure, etc.?
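
For reference, the 90.3% figure in the sample output is reproducible from the
counters printed next to it. A minimal sketch; note the interpretation of the
colon-separated pairs as (slow:fast) allocations and (cached:ring) recycles is
my assumption from the generic page_pool stats, not something the sample
output itself spells out:

```python
# Recycling efficiency = pages recycled / pages allocated, in percent.
# Assumption: "alloc: 656:397681" is slow:fast allocations and
# "recycle: 89652:270201" is cached:ring recycles.
def recycling_rate(alloc_slow, alloc_fast, recycle_cached, recycle_ring):
    allocated = alloc_slow + alloc_fast
    recycled = recycle_cached + recycle_ring
    return 100.0 * recycled / allocated if allocated else 0.0

# Counters from the sample output quoted above:
print(f"{recycling_rate(656, 397681, 89652, 270201):.1f}%")  # 90.3%
```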

Thanks
/Ilias

> The main change compared to the RFC is that the API now exposes
> outstanding references and byte counts even for "live" page pools.
> The warning is no longer printed if the page pool is accessible via netlink.
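
As a sanity check on the exposed counters: the refs/bytes pair in the sample
output is consistent with bytes simply being outstanding references times the
page size. A sketch under that assumption (4 KiB pages on the machine that
produced the sample; higher-order pages would break this simple relation):

```python
# Assumption: the "bytes" counter equals outstanding page refs times
# the system page size (4096 on the sample machine).
PAGE_SIZE = 4096

def inflight_bytes(refs, page_size=PAGE_SIZE):
    return refs * page_size

print(inflight_bytes(41984))  # 171966464, matching the sample's "bytes"
```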
>
> Jakub Kicinski (15):
>   net: page_pool: split the page_pool_params into fast and slow
>   net: page_pool: avoid touching slow on the fastpath
>   net: page_pool: factor out uninit
>   net: page_pool: id the page pools
>   net: page_pool: record pools per netdev
>   net: page_pool: stash the NAPI ID for easier access
>   eth: link netdev to page_pools in drivers
>   net: page_pool: add nlspec for basic access to page pools
>   net: page_pool: implement GET in the netlink API
>   net: page_pool: add netlink notifications for state changes
>   net: page_pool: report amount of memory held by page pools
>   net: page_pool: report when page pool was destroyed
>   net: page_pool: expose page pool stats via netlink
>   net: page_pool: mute the periodic warning for visible page pools
>   tools: ynl: add sample for getting page-pool information
>
>  Documentation/netlink/specs/netdev.yaml       | 161 +++++++
>  Documentation/networking/page_pool.rst        |  10 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c     |   1 +
>  .../net/ethernet/mellanox/mlx5/core/en_main.c |   1 +
>  drivers/net/ethernet/microsoft/mana/mana_en.c |   1 +
>  include/linux/list.h                          |  20 +
>  include/linux/netdevice.h                     |   4 +
>  include/linux/poison.h                        |   2 +
>  include/net/page_pool/helpers.h               |   8 +-
>  include/net/page_pool/types.h                 |  43 +-
>  include/uapi/linux/netdev.h                   |  36 ++
>  net/core/Makefile                             |   2 +-
>  net/core/netdev-genl-gen.c                    |  52 +++
>  net/core/netdev-genl-gen.h                    |  11 +
>  net/core/page_pool.c                          |  78 ++--
>  net/core/page_pool_priv.h                     |  12 +
>  net/core/page_pool_user.c                     | 414 +++++++++++++++++
>  tools/include/uapi/linux/netdev.h             |  36 ++
>  tools/net/ynl/generated/netdev-user.c         | 419 ++++++++++++++++++
>  tools/net/ynl/generated/netdev-user.h         | 171 +++++++
>  tools/net/ynl/lib/ynl.h                       |   2 +-
>  tools/net/ynl/samples/.gitignore              |   1 +
>  tools/net/ynl/samples/Makefile                |   2 +-
>  tools/net/ynl/samples/page-pool.c             | 147 ++++++
>  24 files changed, 1586 insertions(+), 48 deletions(-)
>  create mode 100644 net/core/page_pool_priv.h
>  create mode 100644 net/core/page_pool_user.c
>  create mode 100644 tools/net/ynl/samples/page-pool.c
>
> --
> 2.41.0
>
