[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55AD3687.9010002@gmail.com>
Date: Mon, 20 Jul 2015 10:57:27 -0700
From: Florian Fainelli <f.fainelli@...il.com>
To: sfeldma@...il.com, netdev@...r.kernel.org
CC: jiri@...nulli.us, roopa@...ulusnetworks.com,
simon.horman@...ronome.com, nicolas.dichtel@...nd.com,
andrew@...n.ch, vivien.didelot@...oirfairelinux.com,
linux@...ck-us.net
Subject: Re: [PATCH net-next v3 0/5] switchdev: avoid duplicate packet forwarding
Hi Scott,
On 18/07/15 18:24, sfeldma@...il.com wrote:
> From: Scott Feldman <sfeldma@...il.com>
>
> v3:
>
> - Per Nicolas Dichtel review: remove errant empty union.
>
> v2:
>
> - Per davem review: in sk_buff, union fwd_mark with secmark to save space
> since features appear to be mutually exclusive.
> - Per Simon Horman review:
> - fix grammar in switchdev.txt wrt fwd_mark
> - remove some unrelated changes that snuck in
>
> v1:
>
> This patchset was previously submitted as RFC. No changes from the last
> version (v2) sent under RFC. Including RFC version history here for reference.
This looks good to me, and it looks like this is also relevant for
DSA-driven switches, however I am not exactly clear on how we could
potentially use this on such switches.
For Broadcom tags, there is a reason code forwarded along with the tag
making it to the CPU, and the "mirror" bit sounds like something that
should dictate whether we should be setting the fwd_mark or not.
What does it look like on Marvell switches?
>
> RFC v2:
>
> - s/fwd_mark/offload_fwd_mark
> - use consume_skb rather than kfree_skb when dropping pkt on egress.
> - Use Jiri's suggestion to use ifindex of one of the ports in a group
> as the mark for all the ports in the group. This can be done with
> no additional storage (no hashtable from v1). To pull it off, we
> need some simple recursive routines to walk the netdev tree ensuring
> all leaves in the tree (ports) in the same group (e.g. bridge)
> belonging to the same switch device will have the same offload fwd mark.
> Maybe someone sees a better design for the recusive routines? They're
> not too bad, and should cover the stacked driver cases.
>
> RFC v1:
>
> With switchdev support for offloading L2/L3 forwarding data path to a
> switch device, we have a general problem where both the device and the
> kernel may forward the packet, resulting in duplicate packets on the wire.
> Anytime a packet is forwarded by the device and a copy is sent to the CPU,
> there is potential for duplicate forwarding, as the kernel may also do a
> forwarding lookup and send the packet on the wire.
>
> The specific problem this patch series is interested in solving is avoiding
> duplicate packets on bridged ports. There was a previous RFC from Roopa
> (http://marc.info/?l=linux-netdev&m=142687073314252&w=2) to address this
> problem, but didn't solve the problem of mixed ports in the bridge from
> different devices; there was no way to exclude some ports from forwarding
> and include others. This RFC solves that problem by tagging the ingressing
> packet with a unique mark, and then comparing the packet mark with the
> egress port mark, and skip forwarding when there is a match. For the mixed
> ports bridge case, only those ports with matching marks are skipped.
>
> The switchdev port driver must do two things:
>
> 1) Generate a fwd_mark for each switch port, using some unique key of the
> switch device (and optionally port). This is done when the port netdev
> is registered or if the port's group membership changes (joins/leaves
> a bridge, for example).
>
> 2) On packet ingress from port, mark the skb with the ingress port's
> fwd_mark. If the device supports it, it's useful to only mark skbs
> which were already forwarded by the device. If the device does not
> support such indication, all skbs can be marked, even if they're
> local dst.
>
> Two new 32-bit fields are added to struct sk_buff and struct netdevice to
> hold the fwd_mark. I've wrapped these with CONFIG_NET_SWITCHDEV for now. I
> tried using skb->mark for this purpose, but ebtables can overwrite the
> skb->mark before the bridge gets it, so that will not work.
>
> In general, this fwd_mark can be used for any case where a packet is
> forwarded by the device and a copy is sent to the CPU, to avoid the kernel
> re-forwarding the packet. sFlow is another use-case that comes to mind,
> but I haven't explored the details.
> Scott Feldman (5):
> net: don't reforward packets already forwarded by offload device
> net: add phys ID compare helper to test if two IDs are the same
> switchdev: add offload_fwd_mark generator helper
> rocker: add offload_fwd_mark support
> switchdev: update documentation for offload_fwd_mark
>
> Documentation/networking/switchdev.txt | 14 +++-
> drivers/net/ethernet/rocker/rocker.c | 11 ++++
> drivers/net/ethernet/rocker/rocker.h | 1 +
> include/linux/netdevice.h | 13 ++++
> include/linux/skbuff.h | 9 ++-
> include/net/switchdev.h | 9 +++
> net/core/dev.c | 10 +++
> net/switchdev/switchdev.c | 111 ++++++++++++++++++++++++++++++--
> 8 files changed, 169 insertions(+), 9 deletions(-)
>
--
Florian
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists