netdev - Re: [PATCH RFC net-next 0/3] Multi-CPU DSA support

Message-ID: <20210412213402.vwvon2fdtzf4hnrt@skbuf>
Date:   Tue, 13 Apr 2021 00:34:02 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Tobias Waldekranz <tobias@...dekranz.com>
Cc:     Marek Behun <marek.behun@....cz>,
        Ansuel Smith <ansuelsmth@...il.com>, netdev@...r.kernel.org,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andriin@...com>,
        Eric Dumazet <edumazet@...gle.com>,
        Wei Wang <weiwan@...gle.com>,
        Cong Wang <cong.wang@...edance.com>,
        Taehee Yoo <ap420073@...il.com>,
        Björn Töpel <bjorn@...nel.org>,
        zhang kai <zhangkaiheb@....com>,
        Weilong Chen <chenweilong@...wei.com>,
        Roopa Prabhu <roopa@...ulusnetworks.com>,
        Di Zhu <zhudi21@...wei.com>,
        Francis Laniel <laniel_francis@...vacyrequired.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC net-next 0/3] Multi-CPU DSA support

On Mon, Apr 12, 2021 at 11:22:45PM +0200, Tobias Waldekranz wrote:
> On Mon, Apr 12, 2021 at 21:30, Marek Behun <marek.behun@....cz> wrote:
> > On Mon, 12 Apr 2021 14:46:11 +0200
> > Tobias Waldekranz <tobias@...dekranz.com> wrote:
> >
> >> I agree. Unless you only have a few really wideband flows, a LAG will
> >> typically do a great job with balancing. This will happen without the
> >> user having to do any configuration at all. It would also perform well
> >> in "router-on-a-stick"-setups where the incoming and outgoing port is
> >> the same.
> >
> > TLDR: The problem with LAGs as they are currently implemented is that
> > for Turris Omnia, basically in 1/16 of configurations the traffic would
> > go via one CPU port anyway.
> >
> >
> >
> > One potential problem that I see with using LAGs for aggregating CPU
> > ports on mv88e6xxx is how these switches determine the port for a
> > packet: only the src and dst MAC addresses are used for the hash that
> > chooses the port.
> >
> > The most common scenario for Turris Omnia, for example, where we have 2
> > CPU ports and 5 user ports, is that into these 5 user ports the user
> > plugs 5 simple devices (no switches, so only one peer MAC address per
> > port). So we have only 5 pairs of src + dst MAC addresses. If we simply
> > fill the LAG table as it is done now, then there is a 2 * 0.5^5 = 1/16
> > chance that all packets would go through one CPU port.
> >
> > In order to have real load balancing in this scenario, we would either
> > have to recompute the LAG mask table depending on the MAC addresses, or
> > periodically rewrite the LAG mask table somewhat randomly. (This could
> > in theory be offloaded onto the Z80 internal CPU for some of the
> > switches of the mv88e6xxx family, but not for Omnia.)
> 
> I thought that the option to associate each port netdev with a DSA
> master would only be used on transmit. Are you saying that there is a
> way to configure an mv88e6xxx chip to steer packets to different CPU
> ports depending on the incoming port?
> 
> The reason that the traffic is directed towards the CPU is that some
> kind of entry in the ATU says so, and the destination of that entry will
> either be a port vector or a LAG. Of those two, only the LAG will offer
> any kind of balancing. What am I missing?
> 
> Transmit is easy; you are already in the CPU, so you can use an
> arbitrarily fancy hashing algo/ebpf classifier/whatever to load balance
> in that case.

Say a user port receives a broadcast frame. Based on your understanding
that user-to-CPU port assignments are used only for TX, which CPU port
should be selected by the switch for this broadcast packet, and by which
mechanism?
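
For reference, the 2 * 0.5^5 = 1/16 figure quoted above can be checked by
brute force. The sketch below is only an illustration under a simplifying
assumption: each of the 5 src + dst MAC pairs lands on one of the 2 CPU
ports independently and with equal probability. It does not model the real
mv88e6xxx hash or LAG mask table, and the function name
prob_single_cpu_port is made up for this example.

```python
from fractions import Fraction
from itertools import product


def prob_single_cpu_port(n_flows: int, n_cpu_ports: int = 2) -> Fraction:
    """Probability that every flow ends up on the same CPU port.

    Assumption (not taken from the hardware): each flow is assigned to a
    CPU port independently and uniformly at random.
    """
    # Enumerate every possible assignment of flows to CPU ports.
    outcomes = list(product(range(n_cpu_ports), repeat=n_flows))
    # Count the assignments in which only one distinct port is used.
    degenerate = sum(1 for o in outcomes if len(set(o)) == 1)
    return Fraction(degenerate, len(outcomes))


if __name__ == "__main__":
    # 5 user ports with one peer each, 2 CPU ports (the Omnia scenario).
    print(prob_single_cpu_port(5))
```

Under that assumption, 2 of the 32 possible assignments put all five flows
on a single CPU port, which matches the 1/16 estimate.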