Message-ID: <20170905171141.7040519b@xeon-e3>
Date: Tue, 5 Sep 2017 17:11:41 -0700
From: Stephen Hemminger <stephen@...workplumber.org>
To: Andrew Lunn <andrew@...n.ch>
Cc: netdev <netdev@...r.kernel.org>, jiri@...nulli.us,
nikolay@...ulusnetworks.com,
Florian Fainelli <f.fainelli@...il.com>,
Vivien Didelot <vivien.didelot@...oirfairelinux.com>
Subject: Re: [PATCH v2 rfc 0/8] IGMP snooping for local traffic
On Wed, 6 Sep 2017 01:35:02 +0200
Andrew Lunn <andrew@...n.ch> wrote:
> After the very useful feedback from Nikolay, i threw away what i had,
> and started again. To recap:
>
> The linux bridge supports IGMP snooping. It will listen to IGMP
> reports on bridge ports and keep track of which groups have been
> joined on an interface. It will then forward multicast based on this
> group membership.
>
> When the bridge adds or removes groups from an interface, it uses
> switchdev to request the hardware add an mdb to a port, so the
> hardware can perform the selective forwarding between ports.
>
> What is not covered by the current bridge code, is IGMP joins/leaves
> from the host on the brX interface. These are not reported via
> switchdev so that hardware knows the local host is interested in the
> multicast frames.
>
> Luckily, the bridge does track joins/leaves on the brX interface. The
> code is obfuscated, which is why i missed it with my first attempt.
> So the first patch tries to remove this obfuscation. Currently,
> there are no notifications sent when the bridge interface joins a
> group. The second patch adds them. bridge monitor then shows
> joins/leaves in the same way as for other ports of the bridge.
>
> Then starts the work passing down to the hardware that the host has
> joined/left a group. The existing switchdev mdb object cannot be used,
> since the semantics are different. The existing
> SWITCHDEV_OBJ_ID_PORT_MDB is used to indicate a specific multicast
> group should be forwarded out that port of the switch. However here we
> require the exact opposite. We want multicast frames for the group
> received on the port to be forwarded to the host. Hence add a new
> object SWITCHDEV_OBJ_ID_HOST_MDB, a multicast database entry to
> forward to the host. This new object is then propagated through the
> DSA layers. No DSA driver changes should be needed, this should just
> work...
>
> Getting the frames to the bridge as requested turned up an issue or
> three. The offload_fwd_mark is not being set by DSA, so the bridge
> floods the received frames back to the switch ports, resulting in
> duplication since the hardware has already flooded the packet. Fixing
> that turned up an issue with the meaning of
> SWITCHDEV_ATTR_ID_PORT_PARENT_ID in DSA. A DSA fabric of three
> switches needs to look to the software bridge as a single
> switch. Otherwise the offload_fwd_mark does not work, and we get
> duplication on the non-ingress switch. But each switch returned a
> different value. And they were not unique.
>
> The third and last issue will be explained in a followup email.
>
> Open questions:
>
> Is sending notifications going to break userspace?
> Is this new switchdev object O.K. for the few non-DSA switches that exist?
> Is the SWITCHDEV_ATTR_ID_PORT_PARENT_ID change acceptable?
>
> Andrew
>
> Andrew Lunn (8):
> net: bridge: Rename mglist to host_joined
> net: bridge: Send notification when host join/leaves a group
> net: bridge: Add/del switchdev object on host join/leave
> net: dsa: slave: Handle switchdev host mdb add/del
> net: dsa: switch: handle host mdb add/remove
> net: dsa: switch: Don't add CPU port to an mdb by default
> net: dsa: set offload_fwd_mark on received packets
> net: dsa: Fix SWITCHDEV_ATTR_ID_PORT_PARENT_ID
>
> include/net/switchdev.h | 1 +
> net/bridge/br_input.c | 2 +-
> net/bridge/br_mdb.c | 50 +++++++++++++++++++++++++++++---
> net/bridge/br_multicast.c | 18 +++++++-----
> net/bridge/br_private.h | 2 +-
> net/dsa/dsa.c | 1 +
> net/dsa/dsa_priv.h | 7 +++++
> net/dsa/port.c | 26 +++++++++++++++++
> net/dsa/slave.c | 16 ++++++++---
> net/dsa/switch.c | 72 +++++++++++++++++++++++++++++++++++++++--------
> net/switchdev/switchdev.c | 2 ++
> 11 files changed, 168 insertions(+), 29 deletions(-)
>
This looks much cleaner. I don't have DSA hardware or infrastructure to look deeper.