Message-ID: <87sg8h5lsc.fsf@waldekranz.com>
Date:   Mon, 07 Dec 2020 22:49:39 +0100
From:   Tobias Waldekranz <tobias@...dekranz.com>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     davem@...emloft.net, kuba@...nel.org, andrew@...n.ch,
        vivien.didelot@...il.com, f.fainelli@...il.com,
        j.vosburgh@...il.com, vfalico@...il.com, andy@...yhouse.net,
        netdev@...r.kernel.org
Subject: Re: [PATCH v3 net-next 2/4] net: dsa: Link aggregation support

On Fri, Dec 04, 2020 at 02:56, Vladimir Oltean <olteanv@...il.com> wrote:
> On Fri, Dec 04, 2020 at 12:12:32AM +0100, Tobias Waldekranz wrote:
>> You make a lot of good points. I think it might be better to force the
>> user to be explicit about their choice though. Imagine something like
>> this:
>>
>> - We add NETIF_F_SWITCHDEV_OFFLOAD, which is set on switchdev ports by
>>   default. This flag is only allowed to be toggled when the port has no
>>   uppers - we do not want to deal with a port in a LAG in a bridge all
>>   of a sudden changing mode.
>>
>> - If it is set, we only allow uppers/tc-rules/etc that we can
>>   offload. If the user tries to configure something outside of that, we
>>   can suggest disabling offloading in the error we emit.
>>
>> - If it is not set, we just sit back and let the kernel do its thing.
>>
>> This would work well both for exotic LAG modes and for advanced
>> netfilter(ebtables)/tc setups I think. Example session:
>>
>> $ ip link add dev bond0 type bond mode balance-rr
>> $ ip link set dev swp0 master bond0
>> Error: swp0: balance-rr not supported when using switchdev offloading
>> $ ethtool -K swp0 switchdev off
>> $ ip link set dev swp0 master bond0
>> $ echo $?
>> 0
>
> And you want the default to be what, on or off? I believe on?
> I'd say the default should be off. The idea being that you could have
> "write once, run everywhere" types of scripts. You can only get that
> behavior with "off", otherwise you'd get random errors on some equipment
> and it wouldn't be portable. And "ethtool -K swp0 switchdev off" is a
> bit of a strange incantation to add to every script just to avoid
> errors.. But if the default switchdev mode is off, then what's the
> point in even having the knob, your poor Linus will still be confused
> and frustrated, and it won't help him any bit if he can flip the switch
> - it's too late, he already knows what the problem is by the time he
> finds the switch.

Yeah, I cannot argue with that. OK, I surrender: software fallback it
is.
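
To make the agreed behavior concrete, here is a rough model of it in
Python rather than kernel C. The names OFFLOADABLE_MODES and join_lag
are hypothetical, and the set of offloadable modes is an assumption
made only for illustration; nothing here is taken from the patch.

```python
# Hypothetical model, not kernel code. Which bond modes a given switch
# can offload is an assumption made for this sketch.
OFFLOADABLE_MODES = {"balance-xor", "802.3ad"}


def join_lag(port, mode):
    """Accept every join; only the forwarding path differs.

    Returns "offloaded" when the (assumed) hardware can handle the mode
    and "software" otherwise, instead of rejecting the configuration.
    """
    if mode in OFFLOADABLE_MODES:
        return "offloaded"
    return "software"


# The session quoted above would then succeed without any extra knob:
print(join_lag("swp0", "balance-rr"))  # falls back to software
print(join_lag("swp0", "802.3ad"))     # stays in hardware
```

The point of the sketch is only that the join never fails, so the same
script runs on every device.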

>> > I would even go out on a limb and say hardcode the TX_TYPE_HASH in DSA
>> > for now. I would be completely surprised to see hardware that can
>> > offload anything else in the near future.
>>
>> If you tilt your head a little, I think active backup is really just a
>> trivial case of a hashed LAG wherein only a single member is ever
>> active. I.e. all buckets are always allocated to one port (effectively
>> negating the hashing). The active member is controlled by software, so I
>> think we should be able to support that.
>
> Yup, my head is tilted and I see it now. If I understand this mode
> (never used it), then any hardware switch that can offload bridging can
> also offload the active-backup LAG.

Neither have I, but I guess you still need an actual LAG to associate
neighbors with, instead of the physical port? Maybe ocelot is different,
but on mv88e6xxx you otherwise get either (1) packet loss when the
active member changes or (2) duplicated packets when more than one
member is active.
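
As a sketch of the "active backup is a trivial hashed LAG" idea, the
bucket allocation could be modeled like this. It is plain Python, not
driver code: NUM_BUCKETS, active_backup_masks and egress_ports are
hypothetical names, and the bucket count of 8 is an assumption.

```python
NUM_BUCKETS = 8  # assumed number of hash buckets, for illustration only


def active_backup_masks(members, active):
    """Per-port bucket bitmask where the active member owns every bucket."""
    all_buckets = (1 << NUM_BUCKETS) - 1
    return {port: (all_buckets if port == active else 0) for port in members}


def egress_ports(masks, bucket):
    """Ports that would transmit a frame hashed into the given bucket."""
    return [port for port, mask in masks.items() if mask & (1 << bucket)]


members = ["swp0", "swp1"]

# Normal operation: whatever the hash says, the frame leaves via swp0.
masks = active_backup_masks(members, active="swp0")
assert all(egress_ports(masks, b) == ["swp0"] for b in range(NUM_BUCKETS))

# Failover is a software decision: move every bucket to the other member.
masks = active_backup_masks(members, active="swp1")
assert all(egress_ports(masks, b) == ["swp1"] for b in range(NUM_BUCKETS))

# Enabling a bucket on more than one member duplicates the frame, which
# is case (2) above.
masks["swp0"] |= 1 << 3
assert egress_ports(masks, 3) == ["swp0", "swp1"]
```

Because every bucket maps to the same port, the hash result no longer
matters, and neighbors stay associated with the LAG rather than with
whichever physical port happens to be active.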

>> mv88e6xxx could also theoretically be made to support broadcast. You can
>> enable any given bucket on multiple ports, but that seems silly.
>
> Yeah, the broadcast bonding mode looks like an oddball. It sounds to me
> almost like HSR/PRP/FRER but without the sequence numbering, which is a
> surefire way to make a mess out of everything. I have no idea how it is
> used (how duplicate elimination is achieved).

That is the way I interpret it as well. I suppose the dedup is done on
some higher layer, or it is some kind of redundant rx-only recorder or
something.
