Message-ID: <3ab771d9f1332d44e7931eb8f8fe9d8d4be10a9b.camel@nvidia.com>
Date: Thu, 9 Nov 2023 17:05:31 +0000
From: Dragos Tatulea <dtatulea@...dia.com>
To: "kuba@...nel.org" <kuba@...nel.org>, "davem@...emloft.net"
<davem@...emloft.net>
CC: "ilias.apalodimas@...aro.org" <ilias.apalodimas@...aro.org>,
"edumazet@...gle.com" <edumazet@...gle.com>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "hawk@...nel.org" <hawk@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>, "almasrymina@...gle.com"
<almasrymina@...gle.com>
Subject: Re: [PATCH net-next 12/15] net: page_pool: report when page pool was
destroyed

On Tue, 2023-10-24 at 09:02 -0700, Jakub Kicinski wrote:
> Report when page pool was destroyed. Together with the inflight
> / memory use reporting this can serve as a replacement for the
> warning about leaked page pools we currently print to dmesg.
>
> Example output for a fake leaked page pool using some hacks
> in netdevsim (one "live" pool, and one "leaked" on the same dev):
>
> $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
> --dump page-pool-get
> [{'id': 2, 'ifindex': 3},
> {'id': 1, 'ifindex': 3, 'destroyed': 133, 'inflight': 1}]
>
The destroyed timestamp really helps to narrow down which tests / test ranges
are triggering the leaks. Thanks!
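
For reference, something like this (untested sketch, standard libc only) can
turn the reported CLOCK_BOOTTIME seconds back into wall-clock time to match
against test timestamps; the offset between CLOCK_REALTIME and CLOCK_BOOTTIME
does the conversion:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv)
{
	struct timespec boot, real;
	/* value of the 'destroyed' attribute from the dump */
	long long destroyed = argc > 1 ? atoll(argv[1]) : 0;
	time_t wall;

	clock_gettime(CLOCK_BOOTTIME, &boot);
	clock_gettime(CLOCK_REALTIME, &real);
	/* wall-clock time at boot + seconds-since-boot of the event */
	wall = (time_t)(real.tv_sec - boot.tv_sec + destroyed);
	printf("destroyed at %s", ctime(&wall));
	return 0;
}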

I was planning to add a per-page_pool "name" in which the driver would encode
the rq index plus the creation timestamp, and the leak printout would show that
name. This approach is much cleaner, though.
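
For the record, roughly what I had in mind (never posted; rq_ix and the "name"
field below are made up, there is no such field in struct page_pool_params):

	/* hypothetical driver-side sketch: encode rq index + creation time
	 * in a pool name so the dmesg leak warning could print it
	 */
	char pp_name[32];

	snprintf(pp_name, sizeof(pp_name), "rq%d-%llu", rq_ix,
		 (unsigned long long)ktime_get_boottime_seconds());
	/* pp_params.name = pp_name;   <-- hypothetical field */
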
For what it's worth:
Tested-by: Dragos Tatulea <dtatulea@...dia.com>
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> Documentation/netlink/specs/netdev.yaml | 9 +++++++++
> include/net/page_pool/types.h | 1 +
> include/uapi/linux/netdev.h | 1 +
> net/core/page_pool.c | 1 +
> net/core/page_pool_priv.h | 1 +
> net/core/page_pool_user.c | 12 ++++++++++++
> 6 files changed, 25 insertions(+)
>
> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
> index 8d995760a14a..8be8f249bed3 100644
> --- a/Documentation/netlink/specs/netdev.yaml
> +++ b/Documentation/netlink/specs/netdev.yaml
> @@ -125,6 +125,14 @@ name: netdev
> type: uint
> doc: |
> Amount of memory held by inflight pages.
> + -
> + name: destroyed
> + type: uint
> + doc: |
> + Seconds in CLOCK_BOOTTIME of when Page Pool was destroyed.
> + Page Pools wait for all the memory allocated from them to be freed
> + before truly disappearing.
> + Absent if Page Pool hasn't been destroyed.
>
> operations:
> list:
> @@ -176,6 +184,7 @@ name: netdev
> - napi-id
> - inflight
> - inflight-mem
> + - destroyed
> dump:
> reply: *pp-reply
> -
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index 7e47d7bb2c1e..f0c51ef5e345 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -193,6 +193,7 @@ struct page_pool {
> /* User-facing fields, protected by page_pools_lock */
> struct {
> struct hlist_node list;
> + u64 destroyed;
> u32 napi_id;
> u32 id;
> } user;
> diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h
> index 26ae5bdd3187..e5bf66d2aa31 100644
> --- a/include/uapi/linux/netdev.h
> +++ b/include/uapi/linux/netdev.h
> @@ -70,6 +70,7 @@ enum {
> NETDEV_A_PAGE_POOL_NAPI_ID,
> NETDEV_A_PAGE_POOL_INFLIGHT,
> NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
> + NETDEV_A_PAGE_POOL_DESTROYED,
>
> __NETDEV_A_PAGE_POOL_MAX,
> NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 30c8fc91fa66..57847fbb76a0 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -949,6 +949,7 @@ void page_pool_destroy(struct page_pool *pool)
> if (!page_pool_release(pool))
> return;
>
> + page_pool_destroyed(pool);
> pool->defer_start = jiffies;
> pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
>
> diff --git a/net/core/page_pool_priv.h b/net/core/page_pool_priv.h
> index 72fb21ea1ddc..7fe6f842a270 100644
> --- a/net/core/page_pool_priv.h
> +++ b/net/core/page_pool_priv.h
> @@ -6,6 +6,7 @@
> s32 page_pool_inflight(const struct page_pool *pool, bool strict);
>
> int page_pool_list(struct page_pool *pool);
> +void page_pool_destroyed(struct page_pool *pool);
> void page_pool_unlist(struct page_pool *pool);
>
> #endif
> diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
> index c971fe9eeb01..1fb5c3cbe412 100644
> --- a/net/core/page_pool_user.c
> +++ b/net/core/page_pool_user.c
> @@ -134,6 +134,10 @@ page_pool_nl_fill(struct sk_buff *rsp, const struct page_pool *pool,
> nla_put_uint(rsp, NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
> inflight * refsz))
> goto err_cancel;
> + if (pool->user.destroyed &&
> + nla_put_uint(rsp, NETDEV_A_PAGE_POOL_DESTROYED,
> + pool->user.destroyed))
> + goto err_cancel;
>
> genlmsg_end(rsp, hdr);
>
> @@ -219,6 +223,14 @@ int page_pool_list(struct page_pool *pool)
> return err;
> }
>
> +void page_pool_destroyed(struct page_pool *pool)
> +{
> + mutex_lock(&page_pools_lock);
> + pool->user.destroyed = ktime_get_boottime_seconds();
> + netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_CHANGE_NTF);
> + mutex_unlock(&page_pools_lock);
> +}
> +
> void page_pool_unlist(struct page_pool *pool)
> {
> mutex_lock(&page_pools_lock);