Message-ID: <CAHS8izPA+hmOkP=jZd3mm1Zux2uaqpOf0poEci-Jn1g7msfkbA@mail.gmail.com>
Date: Wed, 26 Mar 2025 21:59:27 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
Cc: Saeed Mahameed <saeedm@...dia.com>, Toke Høiland-Jørgensen <toke@...hat.com>,
"David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Jesper Dangaard Brouer <hawk@...nel.org>, Leon Romanovsky <leon@...nel.org>, Tariq Toukan <tariqt@...dia.com>,
Andrew Lunn <andrew+netdev@...n.ch>, Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Ilias Apalodimas <ilias.apalodimas@...aro.org>,
Simon Horman <horms@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
Yonglong Liu <liuyonglong@...wei.com>, Pavel Begunkov <asml.silence@...il.com>,
Matthew Wilcox <willy@...radead.org>, netdev@...r.kernel.org, bpf@...r.kernel.org,
linux-rdma@...r.kernel.org, linux-mm@...ck.org, Qiuling Ren <qren@...hat.com>,
Yuying Ma <yuma@...hat.com>
Subject: Re: [PATCH net-next v2 3/3] page_pool: Track DMA-mapped pages and
unmap them when destroying the pool

On Wed, Mar 26, 2025 at 8:54 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
> >>
> >> Since all the tracking added in this patch is performed on DMA
> >> map/unmap, no additional code is needed in the fast path, meaning the
> >> performance overhead of this tracking is negligible there. A
> >> micro-benchmark shows that the total overhead of the tracking itself is
> >> about 400 ns (39 cycles(tsc) 395.218 ns; sum for both map and unmap[2]).
> >> Since this cost is only paid on DMA map and unmap, it seems like an
> >> acceptable cost to fix the late unmap issue. Further optimisation can
> >> narrow the cases where this cost is paid (for instance by eliding the
> >> tracking when DMA map/unmap is a no-op).
> >>
> > What I am missing here, what is the added cost of those extra operations on
> > the slow path compared to before this patch? Total overhead being
> > acceptable doesn't justify the change, we need diff before and after.
>
> Toke used my data in [2] below:
> The above 400ns is the added cost of those extra operations on the slow path,
> before this patch the slow path only cost about 170ns, so there is more than
> 200% performance degradation for the page tracking in this patch, which I
> failed to see why it is acceptable:(
>
You may be correct about the absolute value of the overhead added
(400ns), but I'm not sure it's a 200% regression.

Here is what time_bench_page_pool03_slow actually does each iteration:
- Allocates a page *from the fast path*.
- Frees a page through the slow path (recycling disabled).

Notably, it doesn't exercise any of the slow-path operations that I
imagine are actually expensive: alloc_page, dma_map_page, and
dma_unmap_page.

We do not have an existing benchmark case that actually tests the full
cost of the slow path (i.e. the full cost of page_pool_alloc from the
slow path with dma-mapping, and of page_pool_put_page to the slow path
with dma-unmapping). Such a test case would give us the full picture in
terms of % regression.
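
To make the point about the denominator concrete, here is a minimal
sketch of the arithmetic only. The 400 ns and 170 ns figures are the
ones quoted above; the 1000 ns "full slow path" baseline is a made-up
placeholder, not a measurement, and overhead_pct()/report() are
hypothetical helper names, not part of any benchmark.

```c
#include <stdio.h>

/* Relative overhead, in percent, of added_ns on top of baseline_ns. */
double overhead_pct(double added_ns, double baseline_ns)
{
	return 100.0 * added_ns / baseline_ns;
}

/* Print one comparison line for a given baseline. */
void report(const char *label, double added_ns, double baseline_ns)
{
	printf("%-32s baseline %7.1f ns, +%.1f ns => +%.0f%%\n",
	       label, baseline_ns, added_ns,
	       overhead_pct(added_ns, baseline_ns));
}
```

Calling report("bench03 (no alloc/map/unmap)", 400.0, 170.0) gives
roughly +235%, while the same 400 ns against the placeholder baseline,
report("hypothetical full slow path", 400.0, 1000.0), gives +40%. The
real percentage depends entirely on what the full slow path costs,
which nothing measures today.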

This is partly why I want to upstream the benchmark. Such cases can be
added after it is upstreamed.

--
Thanks,
Mina