[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200524142609.GB1281067@splinter>
Date: Sun, 24 May 2020 17:26:09 +0300
From: Ido Schimmel <idosch@...sch.org>
To: Vladimir Oltean <olteanv@...il.com>
Cc: andrew@...n.ch, f.fainelli@...il.com, vivien.didelot@...il.com,
davem@...emloft.net, jiri@...nulli.us, kuba@...nel.org,
ivecera@...hat.com, netdev@...r.kernel.org,
horatiu.vultur@...rochip.com, allan.nielsen@...rochip.com,
nikolay@...ulusnetworks.com, roopa@...ulusnetworks.com
Subject: Re: [PATCH RFC net-next 10/13] net: bridge: add port flags for host
flooding
On Fri, May 22, 2020 at 12:10:33AM +0300, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@....com>
>
> In cases where the bridge is offloaded by a switchdev, there are
> situations where we can optimize RX filtering towards the host. To be
> precise, the host only needs to do termination, which it can do by
> responding at the MAC addresses of the slave ports and of the bridge
> interface itself. But most notably, it doesn't need to do forwarding,
> so there is no need to see packets with unknown destination address.
>
> But there are, however, cases when a switchdev does need to flood to the
> CPU. Such an example is when the switchdev is bridged with a foreign
> interface, and since there is no offloaded datapath, packets need to
> pass through the CPU. Currently this is the only identified case, but it
> can be extended at any time.
>
> So far, switchdev implementers made driver-level assumptions, such as:
> this chip is never integrated in SoCs where it can be bridged with a
> foreign interface, so I'll just disable host flooding and save some CPU
> cycles. Or: I can never know what else can be bridged with this
> switchdev port, so I must leave host flooding enabled in any case.
>
> Let the bridge drive the host flooding decision, and pass it to
> switchdev via the same mechanism as the external flooding flags.
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> ---
> include/linux/if_bridge.h | 3 +++
> net/bridge/br_if.c | 40 +++++++++++++++++++++++++++++++++++++++
> net/bridge/br_switchdev.c | 4 +++-
> 3 files changed, 46 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index b3a8d3054af0..6891a432862d 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -49,6 +49,9 @@ struct br_ip_list {
> #define BR_ISOLATED BIT(16)
> #define BR_MRP_AWARE BIT(17)
> #define BR_MRP_LOST_CONT BIT(18)
> +#define BR_HOST_FLOOD BIT(19)
> +#define BR_HOST_MCAST_FLOOD BIT(20)
> +#define BR_HOST_BCAST_FLOOD BIT(21)
>
> #define BR_DEFAULT_AGEING_TIME (300 * HZ)
>
> diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
> index a0e9a7937412..aae59d1e619b 100644
> --- a/net/bridge/br_if.c
> +++ b/net/bridge/br_if.c
> @@ -166,6 +166,45 @@ void br_manage_promisc(struct net_bridge *br)
> }
> }
>
> +static int br_manage_host_flood(struct net_bridge *br)
> +{
> + const unsigned long mask = BR_HOST_FLOOD | BR_HOST_MCAST_FLOOD |
> + BR_HOST_BCAST_FLOOD;
> + struct net_bridge_port *p, *q;
> +
> + list_for_each_entry(p, &br->port_list, list) {
> + unsigned long flags = p->flags;
> + bool sw_bridging = false;
> + int err;
> +
> + list_for_each_entry(q, &br->port_list, list) {
> + if (p == q)
> + continue;
> +
> + if (!netdev_port_same_parent_id(p->dev, q->dev)) {
> + sw_bridging = true;
It's not that simple. There are cases where not all bridge slaves have
the same parent ID and still there is no reason to flood traffic to the
CPU. VXLAN, for example.
You could argue that the VXLAN device needs to have the same parent ID
as the physical netdevs member in the bridge, but it will break your
data path. For example, lets assume your hardware decided to flood a
packet in L2. The packet will egress all the local ports, but will also
perform VXLAN encapsulation. The packet continues with the IP of the
remote VTEP(s) to the underlay router and then encounters a neighbour
miss exception, which sends it to the CPU for resolution.
Since this exception was encountered in the router the driver would mark
the packet with 'offload_fwd_mark', as it already performed L2
forwarding. If the VXLAN device has the same parent ID as the physical
netdevs, then the Linux bridge will never let it egress, nothing will
trigger neighbour resolution and the packet will be discarded.
> + break;
> + }
> + }
> +
> + if (sw_bridging)
> + flags |= mask;
> + else
> + flags &= ~mask;
> +
> + if (flags == p->flags)
> + continue;
> +
> + err = br_switchdev_set_port_flag(p, flags, mask);
> + if (err)
> + return err;
> +
> + p->flags = flags;
> + }
> +
> + return 0;
> +}
> +
> int nbp_backup_change(struct net_bridge_port *p,
> struct net_device *backup_dev)
> {
> @@ -231,6 +270,7 @@ static void nbp_update_port_count(struct net_bridge *br)
> br->auto_cnt = cnt;
> br_manage_promisc(br);
> }
> + br_manage_host_flood(br);
> }
>
> static void nbp_delete_promisc(struct net_bridge_port *p)
> diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
> index 015209bf44aa..360806ac7463 100644
> --- a/net/bridge/br_switchdev.c
> +++ b/net/bridge/br_switchdev.c
> @@ -56,7 +56,9 @@ bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p,
>
> /* Flags that can be offloaded to hardware */
> #define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | \
> - BR_MCAST_FLOOD | BR_BCAST_FLOOD)
> + BR_MCAST_FLOOD | BR_BCAST_FLOOD | \
> + BR_HOST_FLOOD | BR_HOST_MCAST_FLOOD | \
> + BR_HOST_BCAST_FLOOD)
>
> int br_switchdev_set_port_flag(struct net_bridge_port *p,
> unsigned long flags,
> --
> 2.25.1
>
Powered by blists - more mailing lists