Message-ID: <390b4235-1bd4-89d2-3a5f-bc51e3599247@phrozen.org>
Date:   Tue, 20 Jun 2017 19:37:35 +0200
From:   John Crispin <john@...ozen.org>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Vivien Didelot <vivien.didelot@...oirfairelinux.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Sean Wang <sean.wang@...iatek.com>, netdev@...r.kernel.org
Subject: Re: [RFC 1/2] net-next: fix DSA flow_dissection



On 20/06/17 16:01, Andrew Lunn wrote:
> On Tue, Jun 20, 2017 at 10:06:54AM +0200, John Crispin wrote:
>> RPS and probably other kernel features are currently broken on some if
>> not all DSA devices. The root cause of this is that skb_get_hash() will
>> call the flow dissector.
> Hi John
>
> What is the call path when the flow dissector is called? I'm wondering
> if we can defer this, and call it later, after the tag code has
> removed the header.
>
> 	Andrew

Hi Andrew,

The Ethernet driver receives the frame and passes it down the line. 
Eventually it ends up inside netif_receive_skb_internal(), where it gets 
added to the backlog. At this point get_rps_cpu() is called. Inside 
get_rps_cpu(), skb_get_hash() is called, which uses the flow dissector 
... and the flow dissector is broken for DSA devices because the switch 
tag is still present in the frame. As a result skb_get_hash() returns 
the same hash for all flows, and every frame is added to the backlog of 
the same core. Once inside the backlog the frame traverses the DSA 
layer, ends up inside the tag driver, and is passed to the slave device 
for further processing, keeping its bad flow hash for its whole life 
cycle.
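
Roughly, the path looks like this (a simplified sketch from memory, not 
the literal upstream code; intermediate helpers are elided):

    netif_receive_skb_internal()
        -> get_rps_cpu()                /* picks the backlog CPU from the hash */
            -> skb_get_hash()           /* computes the hash on first use */
                -> __skb_get_hash()
                    -> __skb_flow_dissect()  /* chokes on the DSA tag */
        -> enqueue_to_backlog()         /* hence always the same CPU for us */

    /* include/linux/skbuff.h, abridged: once the bad hash is stored in
     * the skb, it is never recomputed */
    static inline __u32 skb_get_hash(struct sk_buff *skb)
    {
            if (!skb->l4_hash && !skb->sw_hash)
                    __skb_get_hash(skb);    /* runs the flow dissector */

            return skb->hash;
    }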

In theory we could reset the hash inside the tag driver, but ideally the 
whole life cycle of the frame should happen on the same core to avoid 
possible reordering issues. In addition, RPS stays broken until the 
frame reaches the tag driver. In the case of the MediaTek MT7623 we only 
have one RX IRQ, so in the worst case the RPS handling of the frame, 
while it is still on ethX, happens on the same core that services the 
IRQs. That increases the IRQ latency and reduces the free CPU time, thus 
reducing maximum throughput. I did test resetting the hash inside the 
tag driver; calculating the correct hash from the start, however, yielded 
a huge performance difference, at least on the MT7623: about 30% extra 
maximum throughput. This might not be such a big problem if the SoC has 
a multi-queue Ethernet core, but on the MT7623 it makes a huge difference 
when we can use RPS to delegate all frame processing away from the core 
handling the IRQs.
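
For reference, the reset variant I tested boils down to a single call in 
the tag driver's receive path (abbreviated sketch; the tag parsing in 
net/dsa/tag_mtk.c is elided, skb_clear_hash() is the stock helper):

    static struct sk_buff *mtk_tag_rcv(struct sk_buff *skb,
                                       struct net_device *dev,
                                       struct packet_type *pt,
                                       struct net_device *orig_dev)
    {
            /* ... strip the switch tag, point skb->dev at the slave ... */

            /* drop the bogus hash computed on the master so the next
             * skb_get_hash() re-runs the dissector on clean headers */
            skb_clear_hash(skb);

            return skb;
    }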

     John
