Message-ID: <CAHS8izNdpe7rDm7K4zn4QU-6VqwMwf-LeOJrvXOXhpaikY+tLg@mail.gmail.com>
Date: Fri, 24 Jan 2025 13:00:24 -0800
From: Mina Almasry <almasrymina@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com, 
	pabeni@...hat.com, andrew+netdev@...n.ch, horms@...nel.org, hawk@...nel.org, 
	ilias.apalodimas@...aro.org, asml.silence@...il.com, kaiyuanz@...gle.com, 
	willemb@...gle.com, mkarsten@...terloo.ca, jdamato@...tly.com
Subject: Re: [PATCH net] net: page_pool: don't try to stash the napi id

On Thu, Jan 23, 2025 at 3:16 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Page ppol tried to cache the NAPI ID in page pool info to avoid

Page pool

> having a dependency on the life cycle of the NAPI instance.
> Since commit under Fixes the NAPI ID is not populated until
> napi_enable() and there's a good chance that page pool is
> created before NAPI gets enabled.
>
> Protect the NAPI pointer with the existing page pool mutex,
> the reading path already holds it. napi_id itself we need

The reading paths in page_pool.c don't hold the lock, no? Only the
reading paths in page_pool_user.c seem to.

I could not immediately wrap my head around why pool->p.napi can be
accessed in page_pool_napi_local with no lock, but needs to be
protected in the code in page_pool_user.c. It seems
READ_ONCE/WRITE_ONCE protection is good enough to make sure
page_pool_napi_local doesn't race with
page_pool_disable_direct_recycling in a way that can crash (the
reading code either sees a valid pointer or NULL). Why is that not
good enough to also synchronize the accesses between
page_pool_disable_direct_recycling and page_pool_nl_fill? I.e., drop
the locking?

Is there some guarantee that the napi won't change or get freed while
page_pool_napi_local is running, even though it can change while
page_pool_nl_fill is running?

> to READ_ONCE(), it's protected by netdev_lock() which are
> not holding in page pool.
>
> Before this patch napi IDs were missing for mlx5:
>
>  # ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get
>
>  [{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912},
>   {'id': 143, 'ifindex': 2, 'inflight': 5568, 'inflight-mem': 22806528},
>   {'id': 142, 'ifindex': 2, 'inflight': 5120, 'inflight-mem': 20971520},
>   {'id': 141, 'ifindex': 2, 'inflight': 4992, 'inflight-mem': 20447232},
>   ...
>
> After:
>
>  [{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912,
>    'napi-id': 565},
>   {'id': 143, 'ifindex': 2, 'inflight': 4224, 'inflight-mem': 17301504,
>    'napi-id': 525},
>   {'id': 142, 'ifindex': 2, 'inflight': 4288, 'inflight-mem': 17563648,
>    'napi-id': 524},
>   ...
>
> Fixes: 86e25f40aa1e ("net: napi: Add napi_config")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> CC: hawk@...nel.org
> CC: ilias.apalodimas@...aro.org
> CC: asml.silence@...il.com
> CC: almasrymina@...gle.com
> CC: kaiyuanz@...gle.com
> CC: willemb@...gle.com
> CC: mkarsten@...terloo.ca
> CC: jdamato@...tly.com
> ---
>  include/net/page_pool/types.h |  1 -
>  net/core/page_pool_priv.h     |  2 ++
>  net/core/dev.c                |  2 +-
>  net/core/page_pool.c          |  2 ++
>  net/core/page_pool_user.c     | 15 +++++++++------
>  5 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index ed4cd114180a..7f405672b089 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -237,7 +237,6 @@ struct page_pool {
>         struct {
>                 struct hlist_node list;
>                 u64 detach_time;
> -               u32 napi_id;
>                 u32 id;
>         } user;
>  };
> diff --git a/net/core/page_pool_priv.h b/net/core/page_pool_priv.h
> index 57439787b9c2..2fb06d5f6d55 100644
> --- a/net/core/page_pool_priv.h
> +++ b/net/core/page_pool_priv.h
> @@ -7,6 +7,8 @@
>
>  #include "netmem_priv.h"
>
> +extern struct mutex page_pools_lock;
> +
>  s32 page_pool_inflight(const struct page_pool *pool, bool strict);
>
>  int page_pool_list(struct page_pool *pool);
> diff --git a/net/core/dev.c b/net/core/dev.c
> index afa2282f2604..07b2bb1ce64f 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6708,7 +6708,7 @@ void napi_resume_irqs(unsigned int napi_id)
>  static void __napi_hash_add_with_id(struct napi_struct *napi,
>                                     unsigned int napi_id)
>  {
> -       napi->napi_id = napi_id;
> +       WRITE_ONCE(napi->napi_id, napi_id);
>         hlist_add_head_rcu(&napi->napi_hash_node,
>                            &napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]);
>  }
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index a3de752c5178..ed0f89373259 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -1147,7 +1147,9 @@ void page_pool_disable_direct_recycling(struct page_pool *pool)
>         WARN_ON(!test_bit(NAPI_STATE_SCHED, &pool->p.napi->state));
>         WARN_ON(READ_ONCE(pool->p.napi->list_owner) != -1);
>
> +       mutex_lock(&page_pools_lock);
>         WRITE_ONCE(pool->p.napi, NULL);
> +       mutex_unlock(&page_pools_lock);
>  }
>  EXPORT_SYMBOL(page_pool_disable_direct_recycling);
>
> diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
> index 48335766c1bf..6677e0c2e256 100644
> --- a/net/core/page_pool_user.c
> +++ b/net/core/page_pool_user.c
> @@ -3,6 +3,7 @@
>  #include <linux/mutex.h>
>  #include <linux/netdevice.h>
>  #include <linux/xarray.h>
> +#include <net/busy_poll.h>
>  #include <net/net_debug.h>
>  #include <net/netdev_rx_queue.h>
>  #include <net/page_pool/helpers.h>
> @@ -14,10 +15,11 @@
>  #include "netdev-genl-gen.h"
>
>  static DEFINE_XARRAY_FLAGS(page_pools, XA_FLAGS_ALLOC1);
> -/* Protects: page_pools, netdevice->page_pools, pool->slow.netdev, pool->user.
> +/* Protects: page_pools, netdevice->page_pools, pool->p.napi, pool->slow.netdev,
> + *     pool->user.
>   * Ordering: inside rtnl_lock
>   */
> -static DEFINE_MUTEX(page_pools_lock);
> +DEFINE_MUTEX(page_pools_lock);
>
>  /* Page pools are only reachable from user space (via netlink) if they are
>   * linked to a netdev at creation time. Following page pool "visibility"
> @@ -216,6 +218,7 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
>  {
>         struct net_devmem_dmabuf_binding *binding = pool->mp_priv;
>         size_t inflight, refsz;
> +       unsigned int napi_id;
>         void *hdr;
>
>         hdr = genlmsg_iput(rsp, info);
> @@ -229,8 +232,10 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
>             nla_put_u32(rsp, NETDEV_A_PAGE_POOL_IFINDEX,
>                         pool->slow.netdev->ifindex))
>                 goto err_cancel;
> -       if (pool->user.napi_id &&
> -           nla_put_uint(rsp, NETDEV_A_PAGE_POOL_NAPI_ID, pool->user.napi_id))
> +
> +       napi_id = pool->p.napi ? READ_ONCE(pool->p.napi->napi_id) : 0;

Following up on the above, I wonder if this can be similar to the code
in page_pool_napi_local, so that it works without the mutex protection:

napi = READ_ONCE(pool->p.napi);
if (napi)
   napi_id = READ_ONCE(napi->napi_id);
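
To make the idea concrete, here is a minimal userspace sketch of that
lockless read. It is only a model, not kernel code: struct napi_model,
struct pool_model and the pool_model_*() helpers are hypothetical names,
and READ_ONCE/WRITE_ONCE are re-defined locally as volatile accesses.
It shows that a reader which snapshots the pointer once sees either the
old pointer or NULL. It says nothing about whether the NAPI instance can
be freed after the snapshot, which is the open question above.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel macros: force a single access. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, trimmed-down models of the kernel structures. */
struct napi_model {
	unsigned int napi_id;
};

struct pool_model {
	struct napi_model *napi;	/* stands in for pool->p.napi */
};

/* Writer side: models page_pool_disable_direct_recycling() clearing
 * the pointer.
 */
static void pool_model_unlink_napi(struct pool_model *pool)
{
	WRITE_ONCE(pool->napi, NULL);
}

/* Reader side: load the pointer once, then use only the local copy,
 * so a concurrent clear cannot turn a checked pointer into NULL
 * between the test and the dereference.
 */
static unsigned int pool_model_napi_id(struct pool_model *pool)
{
	struct napi_model *napi = READ_ONCE(pool->napi);

	return napi ? READ_ONCE(napi->napi_id) : 0;
}
```

By contrast, the line in the patch tests pool->p.napi and then
dereferences it again, so it loads the pointer twice and relies on the
mutex to keep it stable between the two loads.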

> +       if (napi_id >= MIN_NAPI_ID &&

I think this check is added to filter out 0? Nit: I would check for 0
explicitly here, since any non-zero napi_id comes from napi->napi_id
and should therefore be valid. Not a necessary change, though.
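
A small userspace sketch of the two candidate checks, to show where they
can differ. The MODEL_* constants and the report_if_*() helpers are
hypothetical; as far as I recall the kernel defines MIN_NAPI_ID as
NR_CPUS + 1, and the CPU count below is picked only for illustration.

```c
#include <stdbool.h>

/* Assumption: illustrative value only; the real NR_CPUS is a kernel
 * config option.
 */
#define MODEL_NR_CPUS		8u
#define MODEL_MIN_NAPI_ID	(MODEL_NR_CPUS + 1u)

/* The check as written in the patch. */
static bool report_if_at_least_min(unsigned int napi_id)
{
	return napi_id >= MODEL_MIN_NAPI_ID;
}

/* The alternative suggested in this nit. */
static bool report_if_nonzero(unsigned int napi_id)
{
	return napi_id != 0;
}
```

The two predicates agree for 0 and for any ID at or above the minimum.
They differ only for IDs between 1 and the minimum minus one, so the
nit only matters if such a value can ever end up in napi->napi_id.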

> +           nla_put_uint(rsp, NETDEV_A_PAGE_POOL_NAPI_ID, napi_id))
>                 goto err_cancel;
>
>         inflight = page_pool_inflight(pool, false);
> @@ -319,8 +324,6 @@ int page_pool_list(struct page_pool *pool)
>         if (pool->slow.netdev) {
>                 hlist_add_head(&pool->user.list,
>                                &pool->slow.netdev->page_pools);
> -               pool->user.napi_id = pool->p.napi ? pool->p.napi->napi_id : 0;
> -
>                 netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_ADD_NTF);
>         }
>
> --
> 2.48.1
>

-- 
Thanks,
Mina
