lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150324142921.GA2026@nanopsycho.orion>
Date:	Tue, 24 Mar 2015 15:29:21 +0100
From:	Jiri Pirko <jiri@...nulli.us>
To:	Scott Feldman <sfeldma@...il.com>
Cc:	roopa <roopa@...ulusnetworks.com>,
	Guenter Roeck <linux@...ck-us.net>,
	John Fastabend <john.fastabend@...il.com>,
	Andrew Lunn <andrew@...n.ch>,
	David Miller <davem@...emloft.net>,
	"Arad, Ronen" <ronen.arad@...el.com>,
	Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next RFC v2] switchdev: bridge: drop hardware
 forwarded packets

Tue, Mar 24, 2015 at 06:59:43AM CET, sfeldma@...il.com wrote:
>On Mon, Mar 23, 2015 at 10:12 AM, roopa <roopa@...ulusnetworks.com> wrote:
>> On 3/22/15, 8:33 PM, Guenter Roeck wrote:
>>>
>>> On 03/22/2015 08:18 PM, John Fastabend wrote:
>>> [ ... ]
>>>>
>>>>
>>>> Sorry I probably wasn't being clear. I'm just saying we don't need to
>>>> have the driver tell us if the packet has been forwarded. All we have
>>>> to do in software is the switch check and assume all packets sent to us
>>>> from the driver have already been handled by the hardware. Roopa is
>>>> working on this I believe.
>>>>
>>>
>>> Ah, ok. Yes, you are correct. Sorry, I missed that.
>>
>> yep, so my first RFC listed three ways to do this,
>> 1) flag on the bridge port
>> 2) check if the port being forwarded to is a switch port, using
>>             - the offload flag
>>             - the parent id (as john fastabend pointed out)
>> 3) A per packet flag which switch driver sets indicating that the packet is
>> hw forwarded.
>>     This is because we have run into cases where we want to move to software
>> forwarding
>>     of certain packets like igmp reports.  (I will get some more details on
>> the particular igmp problem).
>>     In such case, hardware punts the igmp packet to cpu and cpu will do the
>> forwarding.
>>     I think we may hit more cases like this in the future.
>>
>> my RFC v1 was based on 1). RFC v2 was based on 3) above.
>>
>> But, for now, agree that we can just support the more common case using 2).
>> And, we can move to 3) in the future if needed.
>
>Roopa, I think it may be possible to do this without any changes to
>the bridge code or switchdev code by dropping duplicate pkts in the
>swdev driver itself.  The skb is marked with skb_iif set to ifindex of
>ingress port, so when the driver goes to egress a pkt on the port, if
>the skb_iif is one of the other device ports, we can assume the device
>did the fwd already so we can drop the duplicate pkt.  Below is the
>change to rocker.  The driver can get as fancy as it wants in its test
>to drop or not.  This solution works for mixed offload and
>non-offloaded ports in a bridge, or ports from different offload
>devices in the same bridge.
>
>Yes, the bridge is spending overhead to clone pkts to flood to its
>ports.  IGMP snooping mitigates this for mcast.  BR_FLOOD can be
>turned off on the bridge ports to mitigate this for unknown unicast
>floods.  So what's left is bcasts.
>
>What do you think?   napi_id looked like another field that could be
>used to find the src of the pkt, but skb_iif seemed more straight
>forward to use.
>
>diff --git a/drivers/net/ethernet/rocker/rocker.c
>b/drivers/net/ethernet/rocker/rocker.c
>index aab962c..0f7217f7 100644
>--- a/drivers/net/ethernet/rocker/rocker.c
>+++ b/drivers/net/ethernet/rocker/rocker.c
>@@ -3931,15 +3931,28 @@ unmap_frag:
>        return -EMSGSIZE;
> }
>
>+static bool rocker_port_dev_check(struct net_device *dev);
>+
> static netdev_tx_t rocker_port_xmit(struct sk_buff *skb, struct
>net_device *dev)
> {
>        struct rocker_port *rocker_port = netdev_priv(dev);
>        struct rocker *rocker = rocker_port->rocker;
>        struct rocker_desc_info *desc_info;
>        struct rocker_tlv *frags;
>+       struct net_device *in_dev;
>        int i;
>        int err;
>
>+       if (rocker_port_is_bridged(rocker_port)) {
>+               rcu_read_lock();
>+               in_dev = dev_get_by_index_rcu(dev_net(dev), skb->skb_iif);

Hmm, you iterate over all ports for every xmit call :/
Would be nicer if skb_iif would be netdev poiter. Not sure it is doable.


>+               if (in_dev && rocker_port_dev_check(in_dev)) {
>+                       rcu_read_unlock();
>+                       goto skip;
>+               }
>+               rcu_read_unlock();
>+       }
>+
>        desc_info = rocker_desc_head_get(&rocker_port->tx_ring);
>        if (unlikely(!desc_info)) {
>                if (net_ratelimit())
>@@ -3951,7 +3964,7 @@ static netdev_tx_t rocker_port_xmit(struct
>sk_buff *skb, struct net_device *dev)
>
>        frags = rocker_tlv_nest_start(desc_info, ROCKER_TLV_TX_FRAGS);
>        if (!frags)
>-               goto out;
>+               goto drop;
>        err = rocker_tx_desc_frag_map_put(rocker_port, desc_info,
>                                          skb->data, skb_headlen(skb));
>        if (err)
>@@ -3983,9 +3996,10 @@ unmap_frags:
>        rocker_tx_desc_frags_unmap(rocker_port, desc_info);
> nest_cancel:
>        rocker_tlv_nest_cancel(desc_info, frags);
>-out:
>-       dev_kfree_skb(skb);
>+drop:
>        dev->stats.tx_dropped++;
>+skip:
>+       dev_kfree_skb(skb);
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ