Message-ID: <CA+h21hrhcqpF+nTmG6057ckB+CzHQGC+F5_bbAK7TXxmpvzNBQ@mail.gmail.com>
Date: Mon, 4 May 2020 16:15:09 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: DENG Qingfang <dqfext@...il.com>
Cc: netdev <netdev@...r.kernel.org>,
Sean Wang <sean.wang@...iatek.com>,
Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
"David S . Miller" <davem@...emloft.net>,
"moderated list:ARM/Mediatek SoC support"
<linux-mediatek@...ts.infradead.org>,
Russell King <linux@...linux.org.uk>,
Matthias Brugger <matthias.bgg@...il.com>,
René van Dorst <opensource@...rst.com>,
Tom James <tj17@...com>,
Stijn Segers <foss@...atilesystems.org>,
riddlariddla@...mail.com, Szabolcs Hubai <szab.hu@...il.com>,
Paul Fertser <fercerpav@...il.com>
Subject: Re: [RFC PATCH net-next] net: dsa: mt7530: fix roaming from DSA user ports
Hi Qingfang,
On Mon, 4 May 2020 at 15:47, DENG Qingfang <dqfext@...il.com> wrote:
>
> Hi Vladimir,
>
> On Mon, May 4, 2020 at 6:23 PM Vladimir Oltean <olteanv@...il.com> wrote:
> >
> > Hi Qingfang,
> >
> > On Sat, 25 Apr 2020 at 15:03, DENG Qingfang <dqfext@...il.com> wrote:
> > >
> > > When a client moves from a DSA user port to a software port in a bridge,
> > > it cannot reach any other clients that are connected to the DSA user ports.
> > > That is because SA learning on the CPU port is disabled, so the switch
> > > ignores the client's frames from the CPU port and still thinks it is at
> > > the user port.
> > >
> > > Fix it by enabling SA learning on the CPU port.
> > >
> > > To prevent the switch from learning from flooding frames from the CPU
> > > port, set skb->offload_fwd_mark to 1 for unicast and broadcast frames,
> > > and let the switch flood them instead of trapping to the CPU port.
> > > Multicast frames still need to be trapped to the CPU port for snooping,
> > > so set the SA_DIS bit of the MTK tag to 1 when transmitting those frames
> > > to disable SA learning.
> > >
> > > Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
> > > Signed-off-by: DENG Qingfang <dqfext@...il.com>
> > > ---
> >
> > I think enabling learning on the CPU port would fix the problem
> > sometimes, but not always. (actually nothing can solve it always, see
> > below)
> > The switch learns the new route only if it receives any packets from
> > the CPU port, with a SA equal to the station you're trying to reach.
> > But what if the station is not sending any traffic at the moment,
> > because it is simply waiting for connections to it first (just an
> > example)?
> > Unless there is any traffic already coming from the destination
> > station too, your patch won't work.
> > I am currently facing a similar situation with the ocelot/felix
> > switches, but in that case, enabling SA learning on the CPU port is
> > not possible.
>
> Why is it not possible?
>
Because the ocelot/felix hardware does not support learning on the CPU port.
> Then try my previous RFC patch
> "net: bridge: fix client roaming from DSA user port"
> It tries removing entries from the switch when the client moves to another port.
>
Your patch only deletes FDB entries for packets that the software
bridge receives in its fast path. As I said, that won't work if the
software bridge never receives those packets in the first place
because of a stale FDB entry.
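To illustrate the stale-entry problem, here is a rough user-space sketch in C. It is not driver code: the toy_switch structure, the port numbers and every function name are made up for illustration, and it only models a learning table that updates on source addresses.

```c
#include <stdint.h>
#include <string.h>

#define FDB_SIZE   8
#define PORT_FLOOD (-1)   /* unknown destination: flood to all ports */

/* Hypothetical toy model of a learning switch, not the mt7530 driver. */
struct fdb_entry {
	uint8_t mac[6];
	int port;
	int valid;
};

struct toy_switch {
	struct fdb_entry fdb[FDB_SIZE];
	int cpu_port;
	int cpu_learning;  /* nonzero: learn the SA of frames from the CPU port */
};

static struct fdb_entry *fdb_find(struct toy_switch *sw, const uint8_t *mac)
{
	for (int i = 0; i < FDB_SIZE; i++)
		if (sw->fdb[i].valid && memcmp(sw->fdb[i].mac, mac, 6) == 0)
			return &sw->fdb[i];
	return NULL;
}

static void fdb_learn(struct toy_switch *sw, const uint8_t *mac, int port)
{
	struct fdb_entry *e = fdb_find(sw, mac);

	/* No entry yet: take the first free slot, if there is one. */
	for (int i = 0; i < FDB_SIZE && !e; i++)
		if (!sw->fdb[i].valid)
			e = &sw->fdb[i];
	if (!e)
		return;  /* table full: skip learning */

	memcpy(e->mac, mac, 6);
	e->port = port;
	e->valid = 1;
}

/* Process one frame; returns the egress port or PORT_FLOOD. */
static int switch_rx(struct toy_switch *sw, int in_port,
		     const uint8_t *sa, const uint8_t *da)
{
	struct fdb_entry *e;

	/* The SA is learned on ingress, except on a non-learning CPU port. */
	if (in_port != sw->cpu_port || sw->cpu_learning)
		fdb_learn(sw, sa, in_port);

	e = fdb_find(sw, da);
	return e ? e->port : PORT_FLOOD;
}
```

In this model, once station A has been learned on user port 1 and then moves behind the CPU port, frames addressed to A keep going to port 1 until A itself transmits through the CPU port with cpu_learning set. With cpu_learning cleared, even that transmission does not update the entry, and only ageing or an explicit flush would.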
> > The way I dealt with it is by forcing a flush of the FDB entries on
> > the port, in the following scenarios:
> > - link goes down
> > - port leaves its bridge
> > So traffic towards a destination that has migrated away will
> > temporarily be flooded again (towards the CPU port as well).
> > There is still one case that this approach doesn't handle: when
> > the station migrates away from a switch port that is not directly
> > connected to this one. No "link down" event would be generated in
> > that case, so we would still have to wait until the address expires.
> > I don't think that particular situation can be solved.
>
> You're right. Every switch has this issue, even Linux bridge.
>
> > My point is: if we agree that this is a larger problem, then DSA
> > should have a .port_fdb_flush method and schedule a workqueue whenever
> > necessary. Yes, it is a costly operation, but it will still probably
> > take a lot less than the 300 seconds that the bridge configures for
> > address ageing.
> >
> > Thoughts?
> >
> >
> > Thanks,
> > -Vladimir
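For reference, the flush-on-event idea quoted above could look roughly like the following user-space sketch. It is only an illustration under assumptions: the table layout, the event names and the port_fdb_flush() helper are hypothetical, and it is not an existing DSA API. A real implementation would issue the flush to hardware from a workqueue rather than walk an array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOOKUP_FLOOD (-1)  /* no entry: the frame would be flooded */

/* Hypothetical software view of one forwarding database entry. */
struct port_fdb_entry {
	uint8_t mac[6];
	int port;
	int is_static;  /* static entries survive a flush */
	int valid;
};

/* Hypothetical events that should trigger a flush on a port. */
enum port_event {
	PORT_EV_LINK_UP,
	PORT_EV_LINK_DOWN,
	PORT_EV_BRIDGE_LEAVE,
};

/* Drop every dynamic entry that points at 'port'; returns the count. */
static int port_fdb_flush(struct port_fdb_entry *fdb, size_t n, int port)
{
	int removed = 0;

	for (size_t i = 0; i < n; i++) {
		if (fdb[i].valid && !fdb[i].is_static && fdb[i].port == port) {
			fdb[i].valid = 0;
			removed++;
		}
	}
	return removed;
}

/* Returns the learned port for 'mac', or LOOKUP_FLOOD if unknown. */
static int port_fdb_lookup(const struct port_fdb_entry *fdb, size_t n,
			   const uint8_t *mac)
{
	for (size_t i = 0; i < n; i++)
		if (fdb[i].valid && memcmp(fdb[i].mac, mac, 6) == 0)
			return fdb[i].port;
	return LOOKUP_FLOOD;
}

/* Flush on link down and on bridge leave; ignore everything else. */
static int port_handle_event(struct port_fdb_entry *fdb, size_t n,
			     int port, enum port_event ev)
{
	switch (ev) {
	case PORT_EV_LINK_DOWN:
	case PORT_EV_BRIDGE_LEAVE:
		return port_fdb_flush(fdb, n, port);
	default:
		return 0;
	}
}
```

After a flush, lookups for addresses that were learned on that port fall back to flooding, so a station that has moved is reachable again as soon as the flush completes instead of after the ageing time.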
Regards,
-Vladimir