netdev - Re: [PATCH iwl-next v3] ice: use netif_get_num_default_rss


Message-ID: <621665db-e881-4adc-8caa-9275a4ed7a50@intel.com>
Date: Thu, 30 Oct 2025 11:39:30 +0100
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>, Paul Menzel
	<pmenzel@...gen.mpg.de>
CC: <intel-wired-lan@...ts.osuosl.org>, <netdev@...r.kernel.org>,
	<aleksander.lobakin@...el.com>, <jacob.e.keller@...el.com>, "Aleksandr
 Loktionov" <aleksandr.loktionov@...el.com>
Subject: Re: [PATCH iwl-next v3] ice: use netif_get_num_default_rss_queues()

On 10/30/25 10:37, Michal Swiatkowski wrote:
> On Thu, Oct 30, 2025 at 10:10:32AM +0100, Paul Menzel wrote:
>> Dear Michal,
>>
>>
>> Thank you for your patch. For the summary, I’d add:
>>
>> ice: Use netif_get_num_default_rss_queues() to decrease queue number

I would instead just say:
ice: cap the default number of queues to 64

as this is exactly what happens. Then the next paragraph could be:
Use netif_get_num_default_rss_queues() as a better base (instead of
the number of CPU cores), but still cap it to 64 to avoid excess IRQs
assigned to the PF (which would, in some cases, leave nothing for VFs).
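
The proposed default can be sketched as a small model. This is an illustration only, not the patch itself: the halving of the physical core count (rounded up, applied only above two cores) is my reading of what netif_get_num_default_rss_queues() does, and the Python function names and the QUEUE_CAP constant are hypothetical.

```python
# Simplified model of the default queue count discussed above.
# Assumption: netif_get_num_default_rss_queues() returns about half of the
# physical cores (rounded up) when there are more than two, else the count.
# Function names and QUEUE_CAP are illustrative, not driver identifiers.

QUEUE_CAP = 64  # cap named in the suggested summary line


def default_rss_queues(physical_cores: int) -> int:
    """Model of the generic default: half of the physical cores."""
    if physical_cores > 2:
        return (physical_cores + 1) // 2  # integer division, rounded up
    return physical_cores


def ice_default_queues(physical_cores: int, cap: int = QUEUE_CAP) -> int:
    """Default queue count: the generic default, capped at `cap`."""
    return min(default_rss_queues(physical_cores), cap)


if __name__ == "__main__":
    for cores in (2, 16, 96, 288):
        print(cores, "physical cores ->", ice_default_queues(cores), "queues")
```

With these assumptions, a 16-core machine would get 8 queues by default, while a 288-core machine would get 64 instead of 144.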

sorry for such late nitpicks
and, see below too

>>
>> Am 30.10.25 um 09:30 schrieb Michal Swiatkowski:
>>> On some high-core systems (like AMD EPYC Bergamo, Intel Clearwater
>>> Forest) loading the ice driver with default values can lead to queue/irq
>>> exhaustion. This leaves no additional resources for SR-IOV.
>>
>> Could you please elaborate on how to make the queue/irq exhaustion visible?
>>
> 
> What do you mean? On a high-core-count system, let's say num_online_cpus()
> returns 288; on an 8-port card we have only 256 irqs per PF (2k in
> total). The driver will load with 256 queues (and irqs) on each PF.
> Any VF creation command will fail because no free irqs are available.
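
The arithmetic in the quoted example can be checked with a short sketch. It is a simplified model under stated assumptions: one IRQ per queue, no vectors reserved for control or other purposes, and the old default taken as min(online CPUs, IRQs per PF). The function and variable names are hypothetical; only the numbers 288, 256, 8 and 64 come from the thread.

```python
# Back-of-the-envelope IRQ budget for the example above.
# Assumptions (not from the driver): one IRQ per queue, no vectors reserved
# for other purposes, old default = min(online CPUs, IRQs per PF).

ONLINE_CPUS = 288   # num_online_cpus() in the example
IRQS_PER_PF = 256   # IRQs available to each PF
PORTS = 8           # PFs on the card
QUEUE_CAP = 64      # proposed cap on the default


def irqs_left_for_vfs(irqs_per_pf: int, pf_queues: int) -> int:
    """IRQs a PF still has free after taking one per default queue."""
    return max(irqs_per_pf - pf_queues, 0)


old_default = min(ONLINE_CPUS, IRQS_PER_PF)     # PF takes everything it can
new_default = min(ONLINE_CPUS // 2, QUEUE_CAP)  # halved, then capped

if __name__ == "__main__":
    print("total IRQs on the card:", IRQS_PER_PF * PORTS)
    print("old default:", old_default, "queues,",
          irqs_left_for_vfs(IRQS_PER_PF, old_default), "IRQs left per PF")
    print("new default:", new_default, "queues,",
          irqs_left_for_vfs(IRQS_PER_PF, new_default), "IRQs left per PF")
```

Under these assumptions the old default leaves no free IRQs on any PF, which is why VF creation fails, while the capped default leaves 192 per PF.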

this clearly means this is -net material,
even if this commit will be rather unpleasant to backport to stable

> (echo X > /sys/class/net/ethX/device/sriov_numvfs)
> 
>>> In most cases there is no performance reason for using more than half
>>> of num_cpus() queues. Limit the default to that value using the generic
>>> netif_get_num_default_rss_queues().
>>>
>>> Still, the number of queues can be changed with ethtool, up to
>>> num_online_cpus(). It can be done by calling:
>>> $ ethtool -L ethX combined $(nproc)
>>>
>>> This change affects only the default queue amount.
>>
>> How would you judge the regression potential, that is, for people for whom
>> the defaults work well enough, and the queue number is reduced now?
>>
> 
> You can take a look at the commit that introduced the /2 change in
> netif_get_num_default_rss_queues() [1]. There is a good justification
> for such a situation. In short, exceeding the physical core count
> just wastes CPU resources.
> 
> [1] https://lore.kernel.org/netdev/20220315091832.13873-1-ihuguet@redhat.com/
> 
[...]