Message-ID: <87zfn6c2f1.fsf@kurt.kurt.home>
Date: Mon, 14 Oct 2024 14:08:34 +0200
From: Kurt Kanzenbach <kurt@...utronix.de>
To: Joe Damato <jdamato@...tly.com>
Cc: netdev@...r.kernel.org, Tony Nguyen <anthony.l.nguyen@...el.com>,
 Przemek Kitszel <przemyslaw.kitszel@...el.com>, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
 <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, "moderated list:INTEL
 ETHERNET DRIVERS" <intel-wired-lan@...ts.osuosl.org>, open list
 <linux-kernel@...r.kernel.org>
Subject: Re: [RFC net-next 2/2] igc: Link queues to NAPI instances

On Fri Oct 11 2024, Joe Damato wrote:
>> > 16 core Intel(R) Core(TM) i7-1360P
>> >
>> > lspci:
>> > Ethernet controller: Intel Corporation Device 125c (rev 04)
>> >                      Subsystem: Intel Corporation Device 3037
>> >
>> > ethtool -i:
>> > firmware-version: 2017:888d
>> >
>> > $ sudo ethtool -L enp86s0 combined 2
>> > $ sudo ethtool -l enp86s0
>> > Channel parameters for enp86s0:
>> > Pre-set maximums:
>> > RX:		n/a
>> > TX:		n/a
>> > Other:		1
>> > Combined:	4
>> > Current hardware settings:
>> > RX:		n/a
>> > TX:		n/a
>> > Other:		1
>> > Combined:	2
>> >
>> > $ cat /proc/interrupts | grep enp86s0 | cut --delimiter=":" -f1
>> >  144
>> >  145
>> >  146
>> >  147
>> >  148
>> >
>> > Note that IRQ 144 is the "other" IRQ, so if we ignore that one...
>> > /proc/interrupts shows 4 IRQs, despite there being only 2 queues.
>> >
>> > Querying netlink to see which IRQs map to which NAPIs:
>> >
>> > $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>> >                          --dump napi-get --json='{"ifindex": 2}'
>> > [{'id': 8200, 'ifindex': 2, 'irq': 148},
>> >  {'id': 8199, 'ifindex': 2, 'irq': 147},
>> >  {'id': 8198, 'ifindex': 2, 'irq': 146},
>> >  {'id': 8197, 'ifindex': 2, 'irq': 145}]
>> >
>> > This suggests that all 4 IRQs are assigned to a NAPI (this mapping
>> > happens due to netif_napi_set_irq in patch 1).
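>> >
>> > (For reference, the mapping in patch 1 boils down to a call along
>> > the lines of
>> >
>> >   netif_napi_set_irq(&q_vector->napi, msix_entry->vector);
>> >
>> > in the MSI-X setup path, where msix_entry is a stand-in for the
>> > adapter's MSI-X entry for that vector; the exact call site may
>> > differ.)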
>> >
>> > Now query the queues and which NAPIs they are associated with (which
>> > is what patch 2 adds):
>> >
>> > $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>> >                          --dump queue-get --json='{"ifindex": 2}'
>> > [{'id': 0, 'ifindex': 2, 'napi-id': 8197, 'type': 'rx'},
>> >  {'id': 1, 'ifindex': 2, 'napi-id': 8198, 'type': 'rx'},
>> >  {'id': 0, 'ifindex': 2, 'napi-id': 8197, 'type': 'tx'},
>> >  {'id': 1, 'ifindex': 2, 'napi-id': 8198, 'type': 'tx'}]
>> >
>> > As you can see above, since the queues are combined and there are
>> > only 2 of them, NAPI IDs 8197 and 8198 (which are triggered via
>> > IRQs 145 and 146) are displayed.
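>> >
>> > (The queue-to-NAPI links shown here come from patch 2 calling
>> > something along the lines of
>> >
>> >   netif_queue_set_napi(netdev, ring->queue_index,
>> >                        NETDEV_QUEUE_TYPE_RX, &q_vector->napi);
>> >
>> > for each RX ring, and likewise with NETDEV_QUEUE_TYPE_TX for each
>> > TX ring; ring and q_vector are stand-ins for the driver's actual
>> > variables.)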
>> 
>> Is that really correct?
>
> So I definitely think the case where IGC_FLAG_QUEUE_PAIRS is enabled is
> correct, that case is highlighted by the original commit message.

Yes.

>
> I think IGC_FLAG_QUEUE_PAIRS disabled was buggy, as you pointed out, and I've
> made a change I'll include in the next RFC, which I believe fixes it.

Great, thanks :).

>
> Please see below for the case where IGC_FLAG_QUEUE_PAIRS is disabled and a
> walk-through.
>
>> There are four NAPI IDs which are triggered by
>> the four IRQs.
>
> I'm not an IGC expert and I appreciate your review/comments very much, so thank
> you!
>
> I don't think the number of queues I create with ethtool factors
> into whether or not IGC_FLAG_QUEUE_PAIRS is enabled.

igc_ethtool_set_channels() sets adapter->rss_queues and calls
igc_set_flag_queue_pairs(). So, ethtool should influence it.
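
The relevant part of igc_ethtool_set_channels() is roughly the
following (abridged from my reading of igc_ethtool.c, so details may
differ between trees):

  static int igc_ethtool_set_channels(struct net_device *netdev,
                                      struct ethtool_channels *ch)
  {
          struct igc_adapter *adapter = netdev_priv(netdev);
          unsigned int count = ch->combined_count;
          unsigned int max_combined = igc_max_channels(adapter);

          /* ... validation of the requested counts ... */

          if (count != adapter->rss_queues) {
                  adapter->rss_queues = count;
                  igc_set_flag_queue_pairs(adapter, max_combined);

                  /* Reinitialize queues and interrupts to match
                   * the new configuration.
                   */
                  return igc_reinit_queues(adapter);
          }

          return 0;
  }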

> Please forgive me for the length of my message, but let me walk
> through the code to see if I've gotten it right, including some debug
> output I added:
>
> In igc_init_queue_configuration:
>
> max_rss_queues = IGC_MAX_RX_QUEUES (4)
>
> and
>
> adapter->rss_queues = min of 4 or num_online_cpus
>
> which I presume is 16 on my 16-core machine, so:
>
> adapter->rss_queues = 4 (see below for debug output which verifies this)
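>
> For reference, igc_init_queue_configuration() is short; roughly
> (quoting from memory, it may differ slightly between trees):
>
>   static void igc_init_queue_configuration(struct igc_adapter *adapter)
>   {
>           u32 max_rss_queues = IGC_MAX_RX_QUEUES;
>
>           adapter->rss_queues = min_t(u32, max_rss_queues,
>                                       num_online_cpus());
>
>           igc_set_flag_queue_pairs(adapter, max_rss_queues);
>   }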
>
> In igc_set_flag_queue_pairs, the flag IGC_FLAG_QUEUE_PAIRS is set only if:
>
> (adapter->rss_queues (4) > max_rss_queues (4) / 2), which simplifies
> to (4 > 2), meaning the flag would be enabled regardless of the
> number of queues I create with ethtool, as long as I boot my machine
> with 16 cores available.
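>
> The function itself is just the flag update; roughly (again from
> memory, the exact code may differ):
>
>   void igc_set_flag_queue_pairs(struct igc_adapter *adapter,
>                                 const u32 max_rss_queues)
>   {
>           /* Pair RX/TX queues on one vector when more than half
>            * of the available RSS queues are in use, to conserve
>            * MSI-X vectors.
>            */
>           if (adapter->rss_queues > (max_rss_queues / 2))
>                   adapter->flags |= IGC_FLAG_QUEUE_PAIRS;
>           else
>                   adapter->flags &= ~IGC_FLAG_QUEUE_PAIRS;
>   }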
>
> I verified this by adding debug output to igc_set_flag_queue_pairs and
> igc_init_queue_configuration, which outputs:
>
> igc 0000:56:00.0: IGC_FLAG_QUEUE_PAIRS on
> igc 0000:56:00.0: max_rss_queues: 4, rss_queues: 4
>
> That's at boot with the default number of combined queues of 4 (which is also
> the hardware max).
>
> The output with IGC_FLAG_QUEUE_PAIRS on is what was posted in the
> original commit message of this patch, and I believe it to be
> correct.
>
> The only place I can see that IGC_FLAG_QUEUE_PAIRS has any impact
> (aside from ethtool IRQ coalescing, which we can ignore) is
> igc_set_interrupt_capability:
>
>   /* start with one vector for every Rx queue */
>   numvecs = adapter->num_rx_queues;
>   
>   /* if Tx handler is separate add 1 for every Tx queue */
>   if (!(adapter->flags & IGC_FLAG_QUEUE_PAIRS))
>     numvecs += adapter->num_tx_queues;
>
> In this case, the flag only has impact if it is _off_.
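>
> Concretely: with 2 RX and 2 TX queues and the flag off, numvecs is
> 2 + 2 = 4, and adding the single "other" (link) vector gives the 5
> interrupts seen in /proc/interrupts further below.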
>
> It impacts the number of vectors allocated, so I made a small change
> to the driver, which I'll include in the next RFC to deal with the
> IGC_FLAG_QUEUE_PAIRS off case.
>
> In order to get IGC_FLAG_QUEUE_PAIRS off, I boot my machine with the
> kernel command line option "maxcpus=2" (passed via GRUB), which
> should force the flag off.
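>
> (For example by appending maxcpus=2 to GRUB_CMDLINE_LINUX_DEFAULT in
> /etc/default/grub and regenerating the config with update-grub or
> grub-mkconfig -o /boot/grub/grub.cfg; paths and commands vary by
> distro.)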
>
> Checking my debug output at boot to make sure:
>
> igc 0000:56:00.0: IGC_FLAG_QUEUE_PAIRS off
> igc 0000:56:00.0: max_rss_queues: 4, rss_queues: 2
>
> So, now IGC_FLAG_QUEUE_PAIRS is off, which should affect
> igc_set_interrupt_capability and the vector calculation.
>
> Let's check how things look at boot:
>
> $ ethtool -l enp86s0 | tail -5
> Current hardware settings:
> RX:		n/a
> TX:		n/a
> Other:		1
> Combined:	2
>
> 2 combined queues by default when I have 2 CPUs.
>
> $ cat /proc/interrupts  | grep enp
>  127:  enp86s0
>  128:  enp86s0-rx-0
>  129:  enp86s0-rx-1
>  130:  enp86s0-tx-0
>  131:  enp86s0-tx-1
>
> 1 other IRQ, and 2 IRQs for each of RX and TX.
>
> Compare to netlink:
>
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                        --dump napi-get --json='{"ifindex": 2}'
> [{'id': 8196, 'ifindex': 2, 'irq': 131},
>  {'id': 8195, 'ifindex': 2, 'irq': 130},
>  {'id': 8194, 'ifindex': 2, 'irq': 129},
>  {'id': 8193, 'ifindex': 2, 'irq': 128}]
>
> So the driver has 4 IRQs linked to 4 different NAPIs; let's check
> the queues:
>
> $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
>                          --dump queue-get --json='{"ifindex": 2}'
>
> [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
>  {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
>  {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
>
> In this case you can see that each RX and TX queue has a unique NAPI.
>
> I think this is correct, but slightly confusing :) since ethtool
> reports n/a for RX and TX and only shows a combined queue count.
> You were right, though, that there was a bug in this case in the
> code I proposed in this RFC.
>
> I think this new output looks correct and will include the adjusted
> patch and a detailed commit message in the next RFC.
>
> Let me know whether the output looks right to you now.

Looks good to me now.

Thanks,
Kurt
