Message-ID: <CAHS8izNPLnbZ-M=367k7H6OYts7RXbcDpbrWy_p37=62LsYYcg@mail.gmail.com>
Date: Fri, 21 Mar 2025 16:13:17 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Jesper Dangaard Brouer <hawk@...nel.org>, Saeed Mahameed <saeedm@...dia.com>, Leon Romanovsky <leon@...nel.org>,
Tariq Toukan <tariqt@...dia.com>, Andrew Lunn <andrew+netdev@...n.ch>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>, Simon Horman <horms@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>, Yonglong Liu <liuyonglong@...wei.com>,
Yunsheng Lin <linyunsheng@...wei.com>, Pavel Begunkov <asml.silence@...il.com>,
Matthew Wilcox <willy@...radead.org>, netdev@...r.kernel.org, bpf@...r.kernel.org,
linux-rdma@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH net-next 2/3] page_pool: Turn dma_sync and dma_sync_cpu
fields into a bitmap
On Fri, Mar 14, 2025 at 3:12 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>
> Change the single-bit booleans for dma_sync into an unsigned long with
> BIT() definitions so that a subsequent patch can write them both with a
> single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
> side into __page_pool_dma_sync_for_cpu() so it can be disabled for
> non-netmem providers as well.
>
> Signed-off-by: Toke Høiland-Jørgensen <toke@...hat.com>
Reviewed-by: Mina Almasry <almasrymina@...gle.com>
> ---
> include/net/page_pool/helpers.h | 6 +++---
> include/net/page_pool/types.h | 8 ++++++--
> net/core/devmem.c | 3 +--
> net/core/page_pool.c | 9 +++++----
> 4 files changed, 15 insertions(+), 11 deletions(-)
>
> diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
> index 582a3d00cbe2315edeb92850b6a42ab21e509e45..7ed32bde4b8944deb7fb22e291e95b8487be681a 100644
> --- a/include/net/page_pool/helpers.h
> +++ b/include/net/page_pool/helpers.h
> @@ -443,6 +443,9 @@ static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
> const dma_addr_t dma_addr,
> u32 offset, u32 dma_sync_size)
> {
> + if (!(READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_CPU))
> + return;
> +
> dma_sync_single_range_for_cpu(pool->p.dev, dma_addr,
> offset + pool->p.offset, dma_sync_size,
> page_pool_get_dma_dir(pool));
> @@ -473,9 +476,6 @@ page_pool_dma_sync_netmem_for_cpu(const struct page_pool *pool,
> const netmem_ref netmem, u32 offset,
> u32 dma_sync_size)
> {
> - if (!pool->dma_sync_for_cpu)
> - return;
> -
> __page_pool_dma_sync_for_cpu(pool,
> page_pool_get_dma_addr_netmem(netmem),
> offset, dma_sync_size);
I think moving the check into __page_pool_dma_sync_for_cpu() is fine,
but I would actually have preferred to keep it as-is.

I think if we're syncing netmem we should check dma_sync_for_cpu,
because the netmem may not be dma-syncable. But pages will likely
always be dma-syncable, so some driver may have opted to do a perf
optimization by calling __page_pool_dma_sync_for_cpu() directly on a
dma-addr that it knows came from a page, to save some cycles of
netmem checking.
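Roughly the pattern I have in mind, as a sketch (the driver function
and its arguments are made up; only the two page_pool helpers are
real):

	/* Hypothetical driver RX path: the driver knows this pool only
	 * holds real pages, so it skips the netmem wrapper and syncs
	 * the page's dma-addr directly.
	 */
	static void drv_rx_sync_for_cpu(struct page_pool *pool,
					struct page *page,
					u32 offset, u32 len)
	{
		dma_addr_t dma = page_pool_get_dma_addr(page);

		/* Bypasses page_pool_dma_sync_netmem_for_cpu() and its
		 * netmem handling; with this patch it now also pays the
		 * READ_ONCE() + branch that used to be netmem-only.
		 */
		__page_pool_dma_sync_for_cpu(pool, dma, offset, len);
	}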
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index df0d3c1608929605224feb26173135ff37951ef8..fbe34024b20061e8bcd1d4474f6ebfc70992f1eb 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -33,6 +33,10 @@
> #define PP_FLAG_ALL (PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | \
> PP_FLAG_SYSTEM_POOL | PP_FLAG_ALLOW_UNREADABLE_NETMEM)
>
> +/* bit values used in pp->dma_sync */
> +#define PP_DMA_SYNC_DEV BIT(0)
> +#define PP_DMA_SYNC_CPU BIT(1)
> +
> /*
> * Fast allocation side cache array/stack
> *
> @@ -175,12 +179,12 @@ struct page_pool {
>
> bool has_init_callback:1; /* slow::init_callback is set */
> bool dma_map:1; /* Perform DMA mapping */
> - bool dma_sync:1; /* Perform DMA sync for device */
> - bool dma_sync_for_cpu:1; /* Perform DMA sync for cpu */
> #ifdef CONFIG_PAGE_POOL_STATS
> bool system:1; /* This is a global percpu pool */
> #endif
>
> + unsigned long dma_sync;
> +
> __cacheline_group_begin_aligned(frag, PAGE_POOL_FRAG_GROUP_ALIGN);
> long frag_users;
> netmem_ref frag_page;
> diff --git a/net/core/devmem.c b/net/core/devmem.c
> index 7c6e0b5b6acb55f376ec725dfb71d1f70a4320c3..16e43752566feb510b3e47fbec2d8da0f26a6adc 100644
> --- a/net/core/devmem.c
> +++ b/net/core/devmem.c
> @@ -337,8 +337,7 @@ int mp_dmabuf_devmem_init(struct page_pool *pool)
> /* dma-buf dma addresses do not need and should not be used with
> * dma_sync_for_cpu/device. Force disable dma_sync.
> */
> - pool->dma_sync = false;
> - pool->dma_sync_for_cpu = false;
> + pool->dma_sync = 0;
>
> if (pool->p.order != 0)
> return -E2BIG;
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index acef1fcd8ddcfd1853a6f2055c1f1820ab248e8d..d51ca4389dd62d8bc266a9a2b792838257173535 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -203,7 +203,7 @@ static int page_pool_init(struct page_pool *pool,
> memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
>
> pool->cpuid = cpuid;
> - pool->dma_sync_for_cpu = true;
> + pool->dma_sync = PP_DMA_SYNC_CPU;
>
More pedantically, this should have been pool->dma_sync |=
PP_DMA_SYNC_CPU, but it doesn't matter in practice, since this
variable is zero-initialized, I think.
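To spell it out (sketch only; this relies on the pool being
zero-allocated when it is created, which I believe it is):

	/* What the patch does: clobbers any bits already set in
	 * pool->dma_sync.
	 */
	pool->dma_sync = PP_DMA_SYNC_CPU;

	/* What I'd pedantically prefer: OR in only the CPU-sync bit.
	 * Equivalent here because nothing has set pool->dma_sync yet.
	 */
	pool->dma_sync |= PP_DMA_SYNC_CPU;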
--
Thanks,
Mina