Message-ID: <CAHS8izNdpe7rDm7K4zn4QU-6VqwMwf-LeOJrvXOXhpaikY+tLg@mail.gmail.com>
Date: Fri, 24 Jan 2025 13:00:24 -0800
From: Mina Almasry <almasrymina@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com, andrew+netdev@...n.ch, horms@...nel.org, hawk@...nel.org,
ilias.apalodimas@...aro.org, asml.silence@...il.com, kaiyuanz@...gle.com,
willemb@...gle.com, mkarsten@...terloo.ca, jdamato@...tly.com
Subject: Re: [PATCH net] net: page_pool: don't try to stash the napi id
On Thu, Jan 23, 2025 at 3:16 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Page ppol tried to cache the NAPI ID in page pool info to avoid
nit: s/ppol/pool/
> having a dependency on the life cycle of the NAPI instance.
> Since the commit under Fixes, the NAPI ID is not populated until
> napi_enable() and there's a good chance that page pool is
> created before NAPI gets enabled.
>
> Protect the NAPI pointer with the existing page pool mutex;
> the reading path already holds it. napi_id itself we need
The reading paths in page_pool.c don't hold the lock, no? Only the
reading paths in page_pool_user.c seem to.
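
For reference, the lockless reader I have in mind is
page_pool_napi_local() in page_pool.c; if I am remembering the code
right, it looks roughly like this (paraphrased from memory, not a
verbatim copy):

        static bool page_pool_napi_local(const struct page_pool *pool)
        {
                const struct napi_struct *napi;
                u32 cpuid;

                if (unlikely(!in_softirq()))
                        return false;

                /* direct recycling is only safe on the pool's own CPU */
                cpuid = smp_processor_id();
                if (READ_ONCE(pool->cpuid) == cpuid)
                        return true;

                napi = READ_ONCE(pool->p.napi);
                return napi && READ_ONCE(napi->list_owner) == cpuid;
        }

i.e. it reads pool->p.napi with READ_ONCE and no page_pools_lock.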
I could not immediately wrap my head around why pool->p.napi can be
accessed in page_pool_napi_local with no lock, but needs to be
protected in the code in page_pool_user.c. It seems the
READ_ONCE/WRITE_ONCE protection is good enough to make sure
page_pool_napi_local doesn't race with
page_pool_disable_direct_recycling in a way that can crash (the
reading code either sees a valid pointer or NULL). Why is that not
good enough to also synchronize the accesses between
page_pool_disable_direct_recycling and page_pool_nl_fill? I.e., can we
drop the locking?

Is there some guarantee the napi won't change/get freed while
page_pool_napi_local is running, but can change while
page_pool_nl_fill is running?
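
To make the concern concrete, the interleaving I am worried about for
a lockless page_pool_nl_fill is something like this (hypothetical
sketch; netif_napi_del() standing in for whatever path frees the napi):

        page_pool_nl_fill()              driver teardown
        -------------------              ---------------
        napi = READ_ONCE(pool->p.napi);
                                         page_pool_disable_direct_recycling()
                                           WRITE_ONCE(pool->p.napi, NULL);
                                         netif_napi_del(napi);
                                         /* napi memory potentially freed */
        READ_ONCE(napi->napi_id);        /* use-after-free? */

For page_pool_napi_local that window presumably cannot occur, because
direct recycling is disabled before the napi goes away. Is that the
asymmetry here?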
> to READ_ONCE(); it's protected by netdev_lock(), which we are
> not holding in page pool.
>
> Before this patch napi IDs were missing for mlx5:
>
> # ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get
>
> [{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912},
> {'id': 143, 'ifindex': 2, 'inflight': 5568, 'inflight-mem': 22806528},
> {'id': 142, 'ifindex': 2, 'inflight': 5120, 'inflight-mem': 20971520},
> {'id': 141, 'ifindex': 2, 'inflight': 4992, 'inflight-mem': 20447232},
> ...
>
> After:
>
> [{'id': 144, 'ifindex': 2, 'inflight': 3072, 'inflight-mem': 12582912,
> 'napi-id': 565},
> {'id': 143, 'ifindex': 2, 'inflight': 4224, 'inflight-mem': 17301504,
> 'napi-id': 525},
> {'id': 142, 'ifindex': 2, 'inflight': 4288, 'inflight-mem': 17563648,
> 'napi-id': 524},
> ...
>
> Fixes: 86e25f40aa1e ("net: napi: Add napi_config")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> CC: hawk@...nel.org
> CC: ilias.apalodimas@...aro.org
> CC: asml.silence@...il.com
> CC: almasrymina@...gle.com
> CC: kaiyuanz@...gle.com
> CC: willemb@...gle.com
> CC: mkarsten@...terloo.ca
> CC: jdamato@...tly.com
> ---
> include/net/page_pool/types.h | 1 -
> net/core/page_pool_priv.h | 2 ++
> net/core/dev.c | 2 +-
> net/core/page_pool.c | 2 ++
> net/core/page_pool_user.c | 15 +++++++++------
> 5 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index ed4cd114180a..7f405672b089 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -237,7 +237,6 @@ struct page_pool {
>          struct {
>                  struct hlist_node list;
>                  u64 detach_time;
> -                u32 napi_id;
>                  u32 id;
>          } user;
>  };
> diff --git a/net/core/page_pool_priv.h b/net/core/page_pool_priv.h
> index 57439787b9c2..2fb06d5f6d55 100644
> --- a/net/core/page_pool_priv.h
> +++ b/net/core/page_pool_priv.h
> @@ -7,6 +7,8 @@
>  
>  #include "netmem_priv.h"
>  
> +extern struct mutex page_pools_lock;
> +
>  s32 page_pool_inflight(const struct page_pool *pool, bool strict);
>  
>  int page_pool_list(struct page_pool *pool);
> diff --git a/net/core/dev.c b/net/core/dev.c
> index afa2282f2604..07b2bb1ce64f 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6708,7 +6708,7 @@ void napi_resume_irqs(unsigned int napi_id)
>  static void __napi_hash_add_with_id(struct napi_struct *napi,
>                                      unsigned int napi_id)
>  {
> -        napi->napi_id = napi_id;
> +        WRITE_ONCE(napi->napi_id, napi_id);
>          hlist_add_head_rcu(&napi->napi_hash_node,
>                             &napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]);
>  }
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index a3de752c5178..ed0f89373259 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -1147,7 +1147,9 @@ void page_pool_disable_direct_recycling(struct page_pool *pool)
>          WARN_ON(!test_bit(NAPI_STATE_SCHED, &pool->p.napi->state));
>          WARN_ON(READ_ONCE(pool->p.napi->list_owner) != -1);
>  
> +        mutex_lock(&page_pools_lock);
>          WRITE_ONCE(pool->p.napi, NULL);
> +        mutex_unlock(&page_pools_lock);
>  }
>  EXPORT_SYMBOL(page_pool_disable_direct_recycling);
>
> diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
> index 48335766c1bf..6677e0c2e256 100644
> --- a/net/core/page_pool_user.c
> +++ b/net/core/page_pool_user.c
> @@ -3,6 +3,7 @@
>  #include <linux/mutex.h>
>  #include <linux/netdevice.h>
>  #include <linux/xarray.h>
> +#include <net/busy_poll.h>
>  #include <net/net_debug.h>
>  #include <net/netdev_rx_queue.h>
>  #include <net/page_pool/helpers.h>
> @@ -14,10 +15,11 @@
>  #include "netdev-genl-gen.h"
>  
>  static DEFINE_XARRAY_FLAGS(page_pools, XA_FLAGS_ALLOC1);
> -/* Protects: page_pools, netdevice->page_pools, pool->slow.netdev, pool->user.
> +/* Protects: page_pools, netdevice->page_pools, pool->p.napi, pool->slow.netdev,
> + * pool->user.
>   * Ordering: inside rtnl_lock
>   */
> -static DEFINE_MUTEX(page_pools_lock);
> +DEFINE_MUTEX(page_pools_lock);
>  
>  /* Page pools are only reachable from user space (via netlink) if they are
>   * linked to a netdev at creation time. Following page pool "visibility"
> @@ -216,6 +218,7 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
>  {
>          struct net_devmem_dmabuf_binding *binding = pool->mp_priv;
>          size_t inflight, refsz;
> +        unsigned int napi_id;
>          void *hdr;
>  
>          hdr = genlmsg_iput(rsp, info);
> @@ -229,8 +232,10 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
>              nla_put_u32(rsp, NETDEV_A_PAGE_POOL_IFINDEX,
>                          pool->slow.netdev->ifindex))
>                  goto err_cancel;
> -        if (pool->user.napi_id &&
> -            nla_put_uint(rsp, NETDEV_A_PAGE_POOL_NAPI_ID, pool->user.napi_id))
> +
> +        napi_id = pool->p.napi ? READ_ONCE(pool->p.napi->napi_id) : 0;
Following up on the above, I wonder if this can be made similar to the
code in page_pool_napi_local, and work without the mutex protection:

        napi = READ_ONCE(pool->p.napi);
        if (napi)
                napi_id = READ_ONCE(napi->napi_id);
> +        if (napi_id >= MIN_NAPI_ID &&
I think this check is added to filter out 0? Nit: I would check for 0
here, since any non-zero napi_id should come from napi->napi_id, which
should be valid. Not a necessary change, though.
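
E.g., the same hunk with the simpler check (untested):

        if (napi_id &&
            nla_put_uint(rsp, NETDEV_A_PAGE_POOL_NAPI_ID, napi_id))
                goto err_cancel;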
> +            nla_put_uint(rsp, NETDEV_A_PAGE_POOL_NAPI_ID, napi_id))
>                  goto err_cancel;
>  
>          inflight = page_pool_inflight(pool, false);
> @@ -319,8 +324,6 @@ int page_pool_list(struct page_pool *pool)
>          if (pool->slow.netdev) {
>                  hlist_add_head(&pool->user.list,
>                                 &pool->slow.netdev->page_pools);
> -                pool->user.napi_id = pool->p.napi ? pool->p.napi->napi_id : 0;
> -
>                  netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_ADD_NTF);
>          }
>
> --
> 2.48.1
>
--
Thanks,
Mina