Message-ID: <7565219f-cdbc-4ea4-9122-fe81b5363375@gmail.com>
Date: Thu, 13 Feb 2025 22:37:07 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Mina Almasry <almasrymina@...gle.com>, David Wei <dw@...idwei.uk>
Cc: io-uring@...r.kernel.org, netdev@...r.kernel.org,
Jens Axboe <axboe@...nel.dk>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jesper Dangaard Brouer
<hawk@...nel.org>, David Ahern <dsahern@...nel.org>,
Stanislav Fomichev <stfomichev@...il.com>, Joe Damato <jdamato@...tly.com>,
Pedro Tammela <pctammela@...atatu.com>
Subject: Re: [PATCH net-next v13 04/11] io_uring/zcrx: implement zerocopy
receive pp memory provider
On 2/13/25 20:57, Mina Almasry wrote:
...
>> +static void io_zcrx_scrub(struct io_zcrx_ifq *ifq)
>> +{
>> + struct io_zcrx_area *area = ifq->area;
>> + int i;
>> +
>> + if (!area)
>> + return;
>> +
>> + /* Reclaim back all buffers given to the user space. */
>> + for (i = 0; i < area->nia.num_niovs; i++) {
>> + struct net_iov *niov = &area->nia.niovs[i];
>> + int nr;
>> +
>> + if (!atomic_read(io_get_user_counter(niov)))
>> + continue;
>> + nr = atomic_xchg(io_get_user_counter(niov), 0);
>> + if (nr && !page_pool_unref_netmem(net_iov_to_netmem(niov), nr))
>> + io_zcrx_return_niov(niov);
>
> I assume nr can be > 1?

Right

> If it's always 1, then page_pool_put_netmem()
> does the page_pool_unref_netmem() + page_pool_put_unrefed_netmem() a
> bit more succinctly.
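
For reference, the more succinct form suggested above (a sketch
assuming a single user reference per niov, which isn't the case
here) would be roughly:

	if (atomic_xchg(io_get_user_counter(niov), 0))
		page_pool_put_netmem(niov->pp, net_iov_to_netmem(niov),
				     -1, false);

but since the user can hold several references per niov, the
explicit page_pool_unref_netmem(netmem, nr) is what's needed.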
...
>> + entries = io_zcrx_rqring_entries(ifq);
>> + entries = min_t(unsigned, entries, PP_ALLOC_CACHE_REFILL - pp->alloc.count);
>> + if (unlikely(!entries)) {
>> + spin_unlock_bh(&ifq->rq_lock);
>> + return;
>> + }
>> +
>> + do {
>> + struct io_uring_zcrx_rqe *rqe = io_zcrx_get_rqe(ifq, mask);
>> + struct io_zcrx_area *area;
>> + struct net_iov *niov;
>> + unsigned niov_idx, area_idx;
>> +
>> + area_idx = rqe->off >> IORING_ZCRX_AREA_SHIFT;
>> + niov_idx = (rqe->off & ~IORING_ZCRX_AREA_MASK) >> PAGE_SHIFT;
>> +
>> + if (unlikely(rqe->__pad || area_idx))
>> + continue;
>
> nit: I believe a lot of the unlikely in the file are redundant. AFAIU
> the compiler always treats the condition inside the if as unlikely by
> default if there is no else statement.

That'd be too presumptuous of the compiler. Sections can be reshuffled,
but even without that, the code generation often looks different. The
annotation is in the right place.
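
For context, unlikely() is just a layout/branch-prediction hint; in
include/linux/compiler.h it boils down to:

	#define likely(x)	__builtin_expect(!!(x), 1)
	#define unlikely(x)	__builtin_expect(!!(x), 0)

which lets the compiler move the cold path out of line regardless of
whether an else branch exists.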
...
>> +static netmem_ref io_pp_zc_alloc_netmems(struct page_pool *pp, gfp_t gfp)
>> +{
>> + struct io_zcrx_ifq *ifq = pp->mp_priv;
>> +
>> + /* pp should already be ensuring that */
>> + if (unlikely(pp->alloc.count))
>> + goto out_return;
>> +
>
> As the comment notes, this is a very defensive check that can be
> removed. The pp should never invoke alloc_netmems if it has items in
> the cache.

Maybe I'll kill it in the future, but it might be a good idea to
leave it be as even page_pool.c itself doesn't trust it too much,
see __page_pool_alloc_pages_slow().
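
The guard there looks roughly like this (paraphrasing
net/core/page_pool.c from memory):

	/* Unnecessary as alloc cache is empty, but guarantees zero frames */
	if (unlikely(pool->alloc.count > 0))
		return pool->alloc.cache[--pool->alloc.count];

i.e. the core slow path re-checks the cache even though its callers
shouldn't get there with a non-empty cache.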
>> + io_zcrx_ring_refill(pp, ifq);
>> + if (likely(pp->alloc.count))
>> + goto out_return;
>> +
>> + io_zcrx_refill_slow(pp, ifq);
>> + if (!pp->alloc.count)
>> + return 0;
>> +out_return:
>> + return pp->alloc.cache[--pp->alloc.count];
>> +}
>> +
>> +static bool io_pp_zc_release_netmem(struct page_pool *pp, netmem_ref netmem)
>> +{
>> + struct net_iov *niov;
>> +
>> + if (WARN_ON_ONCE(!netmem_is_net_iov(netmem)))
>> + return false;
>> +
>
> Also a very defensive check that can be removed. There should be no
> way for the pp to release a netmem to the provider that didn't come

Agree, but it's a warning and I don't care about performance
of this chunk to that extent. Maybe we'll remove it later.

> from this provider. netmem should be guaranteed to be a net_iov, and

Not like it matters for now, but I wouldn't say it should be
net_iov, those callbacks were initially proposed for huge pages.

> also an io_uring net_iov (not dma-buf one), and specifically be a
> net_iov from this particular memory provider.
>
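
FWIW the check itself is only a tag-bit test, roughly what netmem.h
does:

	static inline bool netmem_is_net_iov(const netmem_ref netmem)
	{
		return (__force unsigned long)netmem & NET_IOV;
	}

so keeping the WARN_ON_ONCE() is close to free on this path anyway.
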
>> + niov = netmem_to_net_iov(netmem);
>> + net_mp_niov_clear_page_pool(niov);
>> + io_zcrx_return_niov_freelist(niov);
>> + return false;
>> +}
>> +
>> +static int io_pp_zc_init(struct page_pool *pp)
>> +{
>> + struct io_zcrx_ifq *ifq = pp->mp_priv;
>> +
>> + if (WARN_ON_ONCE(!ifq))
>> + return -EINVAL;
>> + if (pp->dma_map)
>> + return -EOPNOTSUPP;
>
> This condition should be flipped actually. pp->dma_map should be true,
> otherwise the provider isn't supported.

It's not implemented in this patch, which is why it's rejected.
You can think of it as an unconditional failure, even though
io_pp_zc_init is not reachable just yet.
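
To spell it out, the core init path does roughly (paraphrased, not
the exact code):

	/* page_pool_init(): providers require the pool to DMA map */
	if (pool->mp_ops) {
		if (!pool->dma_map)
			return -EOPNOTSUPP;
		err = pool->mp_ops->init(pool);
	}

so with pp->dma_map rejected here, ->init() can't succeed until the
DMA mapping support lands in a later patch.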
> From the netmem.rst docs we require that netmem page_pools are
> configured with PP_FLAG_DMA_MAP.
>
> And we actually check that pp->dma_map == true before invoking
> mp_ops->init(). This code shouldn't be working unless I missed
> something.
>
> Also arguably this check is defensive. The pp should confirm that

Sure, and I have nothing against defensive checks in cold paths

> pp->dma_map is true before invoking any memory provider, you should
> assume it is true here (and the devmem provider doesn't check it
> IIRC).
--
Pavel Begunkov