Message-ID: <7510c29a-b60f-e0d7-4129-cb90fe376c74@gmail.com>
Date: Wed, 24 Mar 2021 16:07:47 -0700
From: Florian Fainelli <f.fainelli@...il.com>
To: Vladimir Oltean <olteanv@...il.com>
Cc: Martin Blumenstingl <martin.blumenstingl@...glemail.com>,
netdev@...r.kernel.org, Hauke Mehrtens <hauke@...ke-m.de>,
andrew@...n.ch, vivien.didelot@...il.com, davem@...emloft.net,
kuba@...nel.org
Subject: Re: lantiq_xrx200: Ethernet MAC with multiple TX queues
On 3/24/2021 3:21 PM, Vladimir Oltean wrote:
> Hi Florian,
>
> On Wed, Mar 24, 2021 at 02:09:02PM -0700, Florian Fainelli wrote:
>>
>>
>> On 3/24/2021 1:13 PM, Vladimir Oltean wrote:
>>> Hi Martin,
>>>
>>> On Wed, Mar 24, 2021 at 09:04:16PM +0100, Martin Blumenstingl wrote:
>>>> Hello,
>>>>
>>>> the PMAC (Ethernet MAC) IP built into the Lantiq xRX200 SoCs has
>>>> support for multiple (TX) queues.
>>>> This MAC is connected to the SoC's built-in switch IP (called GSWIP).
>>>>
>>>> Right now the lantiq_xrx200 driver only uses one TX and one RX queue.
>>>> The vendor driver (which mixes DSA/switch and MAC functionality in one
>>>> driver) uses the following approach:
>>>> - eth0 ("lan") uses the first TX queue
>>>> - eth1 ("wan") uses the second TX queue
>>>>
>>>> With the current (mainline) lantiq_xrx200 driver some users are able
>>>> to fill up the first (and only) queue.
>>>> This is why I am thinking about adding support for the second queue to
>>>> the lantiq_xrx200 driver.
>>>>
>>>> My main question is: how do I do it properly?
>>>> Initializing the second TX queue seems simple (calling
>>>> netif_tx_napi_add for a second time).
>>>> But how do I choose the "right" TX queue in xrx200_start_xmit then?
>>
>> If you use DSA you will have a DSA slave network device which calls
>> dev_queue_xmit() towards the DSA master, which will be the xrx200
>> driver, so it is fairly simple for you to implement a queue selection
>> within the xrx200 tagger, for instance.
>>
>> You can take a look at how net/dsa/tag_brcm.c and
>> drivers/net/ethernet/broadcom/bcmsysport.c work as far as mapping queues
>> from the DSA slave network device queue/port number into a queue number
>> for the DSA master.
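
To make the tagger suggestion a bit more concrete, very roughly something
along these lines (illustration only, in the spirit of tag_brcm.c;
gswip_tag_xmit and the port/queue encoding below are made up for the
example, this is not actual code from the tree):

/* Illustrative only: steer the frame to a DSA master TX queue from the
 * tagger's xmit hook, before the skb is handed off to the master netdev.
 * dsa_slave_to_port() comes from net/dsa/dsa_priv.h.
 */
static struct sk_buff *gswip_tag_xmit(struct sk_buff *skb,
				      struct net_device *dev)
{
	struct dsa_port *dp = dsa_slave_to_port(dev);
	u16 queue = skb_get_queue_mapping(skb);

	/* Tell the DSA master which switch port and slave queue this frame
	 * belongs to; the master's .ndo_select_queue can decode this (a
	 * plain netdev_pick_tx() would not honor it).
	 */
	skb_set_queue_mapping(skb, (dp->index << 8) | queue);

	/* ... insert the actual GSWIP special tag into the frame here ... */

	return skb;
}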
>
> What are the benefits of mapping packets to TX queues of the DSA master
> from the DSA layer?
For systemport and bcm_sf2 this was explained in this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d156576362c07e954dc36e07b0d7b0733a010f7d

In a nutshell, the switch hardware can report the queue status back to
systemport's transmit DMA so that it can automatically pace the TX
completion interrupts. To do that we need to establish a mapping between
the DSA slave and master consisting of the switch port number and the
TX queue number, and tell the HW to inspect the congestion status of
that particular port and queue.
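
As an illustration of what the master side of such a mapping can look
like, here is a heavily simplified sketch; example_master_priv, ring_map[]
and per_port_num_tx_queues are placeholder names loosely modelled on what
bcmsysport does, not the real driver code:

/* Simplified sketch of a DSA master's .ndo_select_queue: decode the
 * (port, queue) cookie the tagger stored in skb->queue_mapping and
 * translate it into one of the master's real TX rings.
 */
struct example_master_priv {
	unsigned int per_port_num_tx_queues;
	s16 ring_map[32];	/* (port, queue) -> TX ring, -1 if unmapped */
};

static u16 example_master_select_queue(struct net_device *dev,
				       struct sk_buff *skb,
				       struct net_device *sb_dev)
{
	struct example_master_priv *priv = netdev_priv(dev);
	u16 cookie = skb_get_queue_mapping(skb);
	unsigned int port = cookie >> 8;	/* encoding chosen by the tagger */
	unsigned int queue = cookie & 0xff;
	s16 ring;

	if (!netdev_uses_dsa(dev))
		return netdev_pick_tx(dev, skb, NULL);

	/* ring_map[] is filled in as the switch ports register, which is
	 * also when the DMA is told which port/queue congestion status to
	 * watch for each ring.
	 */
	ring = priv->ring_map[port * priv->per_port_num_tx_queues + queue];
	if (ring < 0)
		return netdev_pick_tx(dev, skb, NULL);

	return ring;
}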
What this is meant to address is "lossless" (within the SoC at least)
behavior when you have user ports that are connected at a speed lower
than that of your internal connection to the switch, which is typically
Gigabit or more. If you send 1 Gbit/s worth of traffic down to a port
that is connected at 100 Mbit/s, there will be roughly 90% packet loss
unless you have a way to pace the Ethernet controller's transmit DMA,
which in turn throttles the TX completion of the socket buffers so
things work nicely. I believe that per-queue flow control was evaluated
before and an out-of-band mechanism was preferred, but I do not remember
the details of the decision to use ACB.
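
Back to the original question: once the GSWIP tagger (or an
ndo_select_queue hook on the master) fills in skb->queue_mapping and the
netdev is registered with two TX queues (alloc_etherdev_mq() plus a second
netif_tx_napi_add() for TX completion, as you noted), picking the right
DMA channel in xrx200_start_xmit becomes trivial. Rough sketch only;
today's driver has a single chan_tx, so the chan_tx[] array below is an
assumption for illustration, not the current structure layout:

/* Sketch only: select the TX DMA channel from the skb's queue mapping. */
static netdev_tx_t xrx200_start_xmit(struct sk_buff *skb,
				     struct net_device *net_dev)
{
	struct xrx200_priv *priv = netdev_priv(net_dev);
	struct xrx200_chan *ch = &priv->chan_tx[skb_get_queue_mapping(skb)];

	/* ... existing descriptor setup and DMA kick, using "ch" ... */

	return NETDEV_TX_OK;
}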
--
Florian