[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210820093723.qdvnvdqjda3m52v6@skbuf>
Date: Fri, 20 Aug 2021 12:37:23 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: Ido Schimmel <idosch@...sch.org>
Cc: Vladimir Oltean <vladimir.oltean@....com>, netdev@...r.kernel.org,
Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Roopa Prabhu <roopa@...dia.com>,
Nikolay Aleksandrov <nikolay@...dia.com>,
Andrew Lunn <andrew@...n.ch>,
Florian Fainelli <f.fainelli@...il.com>,
Vivien Didelot <vivien.didelot@...il.com>,
Vadym Kochan <vkochan@...vell.com>,
Taras Chornyi <tchornyi@...vell.com>,
Jiri Pirko <jiri@...dia.com>, Ido Schimmel <idosch@...dia.com>,
UNGLinuxDriver@...rochip.com,
Grygorii Strashko <grygorii.strashko@...com>,
Marek Behun <kabel@...ckhole.sk>,
DENG Qingfang <dqfext@...il.com>,
Kurt Kanzenbach <kurt@...utronix.de>,
Hauke Mehrtens <hauke@...ke-m.de>,
Woojung Huh <woojung.huh@...rochip.com>,
Sean Wang <sean.wang@...iatek.com>,
Landen Chao <Landen.Chao@...iatek.com>,
Claudiu Manoil <claudiu.manoil@....com>,
Alexandre Belloni <alexandre.belloni@...tlin.com>,
George McCollister <george.mccollister@...il.com>,
Ioana Ciornei <ioana.ciornei@....com>,
Saeed Mahameed <saeedm@...dia.com>,
Leon Romanovsky <leon@...nel.org>,
Lars Povlsen <lars.povlsen@...rochip.com>,
Steen Hegelund <Steen.Hegelund@...rochip.com>,
Julian Wiedmann <jwi@...ux.ibm.com>,
Karsten Graul <kgraul@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
Ivan Vecera <ivecera@...hat.com>,
Vlad Buslov <vladbu@...dia.com>,
Jianbo Liu <jianbol@...dia.com>,
Mark Bloch <mbloch@...dia.com>, Roi Dayan <roid@...dia.com>,
Tobias Waldekranz <tobias@...dekranz.com>,
Vignesh Raghavendra <vigneshr@...com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>
Subject: Re: [PATCH v2 net-next 0/5] Make SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
blocking
On Fri, Aug 20, 2021 at 12:16:10PM +0300, Ido Schimmel wrote:
> On Thu, Aug 19, 2021 at 07:07:18PM +0300, Vladimir Oltean wrote:
> > Problem statement:
> >
> > Any time a driver needs to create a private association between a bridge
> > upper interface and use that association within its
> > SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handler, we have an issue with FDB
> > entries deleted by the bridge when the port leaves. The issue is that
> > all switchdev drivers schedule a work item to have sleepable context,
> > and that work item can be actually scheduled after the port has left the
> > bridge, which means the association might have already been broken by
> > the time the scheduled FDB work item attempts to use it.
>
> This is handled in mlxsw by telling the device to flush the FDB entries
> pointing to the {port, FID} when the VLAN is deleted (synchronously).
Again, central solution vs mlxsw solution.
If a port leaves a LAG that is offloaded but the LAG does not leave the
bridge, the driver still needs to initiate the VLAN deletion. I really
don't like that, it makes switchdev drivers bloated.
As long as you call switchdev_bridge_port_unoffload and you populate the
blocking notifier pointer, you will get replays of item deletion from
the bridge.
> > The solution is to modify switchdev to use its embedded SWITCHDEV_F_DEFER
> > mechanism to make the FDB notifiers emitted from the fastpath be
> > scheduled in sleepable context. All drivers are converted to handle
> > SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE from their blocking notifier block
> > handler (or register a blocking switchdev notifier handler if they
> > didn't have one). This solves the aforementioned problem because the
> > bridge waits for the switchdev deferred work items to finish before a
> > port leaves (del_nbp calls switchdev_deferred_process), whereas a work
> > item privately scheduled by the driver will obviously not be waited upon
> > by the bridge, leading to the possibility of having the race.
>
> How the problem is solved if after this patchset drivers still queue a
> work item?
It's only a problem if you bank on any stateful association between FDB
entries and your ports (aka you expect that port->bridge_dev still holds
the same value in the atomic handler as in the deferred work item). I
think drivers don't do this at the moment, otherwise they would be
broken.
When they need that, they will convert to synchronous handling and all
will be fine.
> DSA supports learning, but does not report the entries to the bridge.
Why is this relevant exactly?
> How are these entries deleted when a port leaves the bridge?
dsa_port_fast_age does the following
(a) deletes the hardware learned entries on a port, in all VLANs
(b) notifies the bridge to also flush its software FDB on that port
It is called
(a) when the STP state changes from a learning-capable state (LEARNING,
FORWARDING) to a non-learning capable state (BLOCKING, LISTENING)
(b) when learning is turned off by the user
(c) when learning is turned off by the port becoming standalone after
leaving a bridge (actually same code path as b)
So the FDB of a port is also flushed when a single switch port leaves a
LAG that is the actual bridge port (maybe not ideal, but I don't know
any better).
> > This is a dependency for the "DSA FDB isolation" posted here. It was
> > split out of that series hence the numbering starts directly at v2.
> >
> > https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/
>
> What is FDB isolation? Cover letter says: "There are use cases which
> need FDB isolation between standalone ports and bridged ports, as well
> as isolation between ports of different bridges".
FDB isolation means exactly what it says: that the hardware FDB lookup
of ports that are standalone, or under one bridge, is unable to find FDB entries
(same MAC address, same VID) learned on another port from another bridge.
> Does it mean that DSA currently forwards packets between ports even if
> they are member in different bridges or standalone?
No, that is plain forwarding isolation in my understanding of terms, and
we have had that for many years now.
Powered by blists - more mailing lists