netdev - Re: [PATCH net-next 4/5] net: dsa: mv88e6xxx: Offload bridge learning flag

Message-ID: <20210317192929.qviweve6acjzrjcq@skbuf>
Date:   Wed, 17 Mar 2021 21:29:29 +0200
From:   Vladimir Oltean <olteanv@...il.com>
To:     Tobias Waldekranz <tobias@...dekranz.com>
Cc:     davem@...emloft.net, kuba@...nel.org, andrew@...n.ch,
        vivien.didelot@...il.com, f.fainelli@...il.com,
        netdev@...r.kernel.org
Subject: Re: [PATCH net-next 4/5] net: dsa: mv88e6xxx: Offload bridge
 learning flag

On Wed, Mar 17, 2021 at 07:45:46PM +0100, Tobias Waldekranz wrote:
> On Wed, Mar 17, 2021 at 16:12, Vladimir Oltean <olteanv@...il.com> wrote:
> > On Mon, Mar 15, 2021 at 10:13:59PM +0100, Tobias Waldekranz wrote:
> >> +	if (flags.mask & BR_LEARNING) {
> >> +		u16 pav = (flags.val & BR_LEARNING) ? (1 << port) : 0;
> >> +
> >> +		err = mv88e6xxx_port_set_assoc_vector(chip, port, pav);
> >> +		if (err)
> >> +			goto out;
> >> +	}
> >> +
> >
> > If flags.val & BR_LEARNING is off, could you please call
> > mv88e6xxx_port_fast_age too? This ensures that existing ATU entries that
> > were automatically learned are purged.
> 
> This opened up another can of worms.
> 
> It turns out that the hardware is incapable of fast aging a LAG.

You sound pretty definitive about it, do you know why?

> I can see two workarounds. Both are awful in their own special ways:
> 
> 1. Iterate over all entries of all FIDs in the ATU, removing all
>    matching dynamic entries. This will accomplish the same thing, but it
>    is a very expensive operation, and having that in the control path of
>    STP does not feel quite right.

When does it ever feel right? :)

I think of it like a faster 'bridge fdb' command (since 'bridge fdb'
traverses the ATU super inefficiently, it dumps the whole table for each
port).
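To make the "slow age" idea concrete, here is a minimal sketch of a single pass over a table that drops matching dynamic entries. It runs against a simulated in-memory table, not the real ATU: struct sim_atu_entry, sim_atu_slow_age() and the exact-match rule on the port vector are all assumptions made for illustration, not mv88e6xxx driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one ATU entry; not the driver's own type. */
struct sim_atu_entry {
	uint16_t fid;      /* forwarding database ID */
	uint16_t portvec;  /* destination port bitmap (or LAG ID in this model) */
	bool is_dynamic;   /* learned automatically, as opposed to static */
	bool valid;        /* slot is in use */
};

/*
 * Walk the whole table once and invalidate every dynamic entry whose
 * port vector equals target_vec. Static entries are left alone.
 * Returns the number of entries removed.
 */
size_t sim_atu_slow_age(struct sim_atu_entry *tbl, size_t n, uint16_t target_vec)
{
	size_t removed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!tbl[i].valid || !tbl[i].is_dynamic)
			continue;
		if (tbl[i].portvec != target_vec)
			continue;
		tbl[i].valid = false;
		removed++;
	}
	return removed;
}
```

The point of the sketch is only that one walk covers all FIDs, whereas 'bridge fdb' repeats the walk for every port.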

On my system with 24 mv88e6xxx ports, 'time bridge fdb' takes around 34
seconds. So that means a 'slow age' will take around 1.4 seconds for a
single LAG.

On the other hand, on my system with 7 sja1105 ports, I have no choice
but to do slow ageing - the hardware simply doesn't have the concept of
'fast ageing'. There, 'time bridge fdb' returns 1.781s, so I expect a
slow age would take around 0.25 seconds. Of course I'm not happy about
it, but I think I'll bite the bullet.
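The estimates above come from dividing the 'time bridge fdb' figure by the number of ports, on the assumption that the dump walks the table once per port and that a slow age needs one walk. A small helper spelling that out (the function name is made up for this sketch):

```c
/*
 * Rough cost of one table walk, assuming 'bridge fdb' walks the
 * table once per port and each walk takes about the same time.
 */
double slow_age_estimate_seconds(double fdb_dump_seconds, unsigned int num_ports)
{
	if (num_ports == 0)
		return 0.0;
	return fdb_dump_seconds / num_ports;
}
```

With the numbers quoted here, 34 / 24 is about 1.4 seconds for the mv88e6xxx system and 1.781 / 7 is about 0.25 seconds for the sja1105 one.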

> 2. Flushing all dynamic entries in the entire ATU. Fast, but obviously
>    results in a period of lots of flooded packets.

This one seems like an overreaction to me. Would that even solve the
problem? Could you destroy and re-create the trunk?

> Any opinion on which approach you think would hurt less? Or, even
> better, if there is a third way that I have missed.
> 
> For this series I am leaning towards making mv88e6xxx_port_fast_age a
> no-op for LAG ports. We could then come back to this problem when we add
> other LAG-related FDB operations like static FDB entries. Acceptable?

Yeah, I guess that's fair.
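
For reference, the behaviour agreed on in this exchange could look roughly like the sketch below: clearing the learning flag triggers a fast age, and the fast age is silently skipped for LAG ports. Everything here is a stand-alone model; struct sim_port, SIM_BR_LEARNING and the sim_* functions are hypothetical and do not reflect the actual mv88e6xxx signatures.

```c
#include <stdbool.h>
#include <stdint.h>

#define SIM_BR_LEARNING (1u << 0) /* stand-in bit, not the kernel's BR_LEARNING value */

/* Hypothetical per-port state, for this sketch only. */
struct sim_port {
	bool is_lag_member;          /* port belongs to a LAG */
	uint16_t pav;                /* port association vector */
	unsigned int fast_age_calls; /* counts flushes actually issued */
};

/* Fast age: deliberately a no-op for LAG ports. */
void sim_port_fast_age(struct sim_port *p)
{
	if (p->is_lag_member)
		return;
	p->fast_age_calls++; /* real hardware would flush the port's dynamic entries here */
}

/* Apply a learning flag change; purge learned entries when learning is turned off. */
void sim_port_set_learning(struct sim_port *p, int port, uint32_t mask, uint32_t val)
{
	if (!(mask & SIM_BR_LEARNING))
		return; /* flag not being changed */

	bool learning = (val & SIM_BR_LEARNING) != 0;

	p->pav = learning ? (uint16_t)(1u << port) : 0;
	if (!learning)
		sim_port_fast_age(p);
}
```

In this model a LAG port keeps whatever dynamic entries it already learned until they age out normally, which is the limitation being accepted for this series.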