linux-kernel - Re: [PATCH net v1 2/2] gve: use max allowed ring size for ZC page


Message-ID: <ca3899b0-f9b7-4b38-a6fd-a964a1746873@gmail.com>
Date: Mon, 10 Nov 2025 12:36:46 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Dragos Tatulea <dtatulea@...dia.com>
Cc: Mina Almasry <almasrymina@...gle.com>, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org, Joshua Washington <joshwash@...gle.com>,
 Harshitha Ramamurthy <hramamurthy@...gle.com>,
 Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Paolo Abeni <pabeni@...hat.com>, Jesper Dangaard Brouer <hawk@...nel.org>,
 Ilias Apalodimas <ilias.apalodimas@...aro.org>,
 Simon Horman <horms@...nel.org>, Willem de Bruijn <willemb@...gle.com>,
 ziweixiao@...gle.com, Vedant Mathur <vedantmathur@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net v1 2/2] gve: use max allowed ring size for ZC
 page_pools

On 11/7/25 13:35, Dragos Tatulea wrote:
> On Thu, Nov 06, 2025 at 05:18:33PM -0800, Jakub Kicinski wrote:
>> On Thu, 6 Nov 2025 17:25:43 +0000 Dragos Tatulea wrote:
>>> On Wed, Nov 05, 2025 at 06:56:46PM -0800, Mina Almasry wrote:
>>>> On Wed, Nov 5, 2025 at 6:22 PM Jakub Kicinski <kuba@...nel.org> wrote:
>>>>> Increasing cache sizes to the max seems very hacky at best.
>>>>> The underlying implementation uses genpool and doesn't even
>>>>> bother to do batching.
>>>>
>>>> OK, my bad. I tried to think through downsides of arbitrarily
>>>> increasing the ring size in a ZC scenario where the underlying memory
>>>> is pre-pinned and allocated anyway, and I couldn't think of any, but I
>>>> won't argue the point any further.
>>>>    
>>> I see a similar issue with io_uring as well: for a 9K MTU with 4K ring
>>> size there are ~1% allocation errors during a simple zcrx test.
>>>
>>> mlx5 calculates 16K pages and the io_uring zcrx buffer matches exactly
>>> that size (16K * 4K). Increasing the buffer doesn't help because the
>>> pool size is still what the driver asked for (+ also the
>>> internal pool limit). Even worse: eventually ENOSPC is returned to the
>>> application. But maybe this error has a different fix.
>>
>> Hm, yes, did you trace it all the way to where it comes from?
>> page pool itself does not have any ENOSPC AFAICT. If the cache
>> is full we free the page back to the provider via .release_netmem
>>
> Yes I did. It happens in io_cqe_cache_refill() when there are no more
> CQEs:
> https://elixir.bootlin.com/linux/v6.17.7/source/io_uring/io_uring.c#L775

-ENOSPC here means io_uring's CQ got full. It's non-fatal: the user
is expected to process completions and reissue the request. Still,
it's best to avoid that for performance reasons, e.g. by making the
CQ bigger, as you already noted.
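
To illustrate the drain-and-reissue pattern, here is a rough sketch in
Python. It is a toy model only, not the io_uring API: the names
ToyCompletionQueue and receive_all are made up for this example, and
the bounded deque merely stands in for a fixed-size CQ. It shows that
a small CQ forces extra drain/reissue round trips, while a larger one
avoids them.

```python
import errno
from collections import deque


class ToyCompletionQueue:
    """Bounded queue standing in for a fixed-size CQ (hypothetical model)."""

    def __init__(self, entries):
        self.entries = entries
        self.cqes = deque()

    def post(self, cqe):
        # Model of the behaviour described above: a full CQ yields -ENOSPC.
        if len(self.cqes) >= self.entries:
            return -errno.ENOSPC
        self.cqes.append(cqe)
        return 0

    def drain(self):
        # The application processes every pending completion, freeing the CQ.
        done = list(self.cqes)
        self.cqes.clear()
        return done


def receive_all(cq, packets):
    """Post one completion per packet; on -ENOSPC, drain and reissue."""
    processed = []
    reissues = 0
    for pkt in packets:
        while cq.post(pkt) == -errno.ENOSPC:
            # Non-fatal: process completions, then retry the same request.
            processed.extend(cq.drain())
            reissues += 1
    processed.extend(cq.drain())
    return processed, reissues


small, small_reissues = receive_all(ToyCompletionQueue(4), list(range(10)))
large, large_reissues = receive_all(ToyCompletionQueue(16), list(range(10)))
print(small_reissues, large_reissues)
```

In this model nothing is lost either way; the smaller queue just pays
for extra round trips, which is the performance cost mentioned above.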

-- 
Pavel Begunkov