netdev - Re: [PATCH net-next v4] Documentation: networking: Clarify switchdev devices behavior

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20190111192056.GA13325@splinter.mtl.com>
Date:   Fri, 11 Jan 2019 19:20:59 +0000
From:   Ido Schimmel <idosch@...lanox.com>
To:     Florian Fainelli <f.fainelli@...il.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "andrew@...n.ch" <andrew@...n.ch>,
        "vivien.didelot@...il.com" <vivien.didelot@...il.com>,
        "cphealy@...il.com" <cphealy@...il.com>,
        Jiri Pirko <jiri@...lanox.com>,
        "bridge@...ts.linux-foundation.org" 
        <bridge@...ts.linux-foundation.org>,
        "nikolay@...ulusnetworks.com" <nikolay@...ulusnetworks.com>,
        "roopa@...ulusnetworks.com" <roopa@...ulusnetworks.com>,
        "rdunlap@...radead.org" <rdunlap@...radead.org>,
        "ilias.apalodimas@...aro.org" <ilias.apalodimas@...aro.org>,
        "ivan.khoronzhuk@...aro.org" <ivan.khoronzhuk@...aro.org>
Subject: Re: [PATCH net-next v4] Documentation: networking: Clarify switchdev
 devices behavior

On Fri, Jan 11, 2019 at 10:34:01AM -0800, Florian Fainelli wrote:
> On 1/11/19 7:06 AM, Ido Schimmel wrote:
> > On Thu, Jan 10, 2019 at 11:32:06AM -0800, Florian Fainelli wrote:
> >> +- with VLAN filtering turned on: frames ingressing the device with a VID that is
> >> +  not programmed into the bridges/switch's VLAN table must be dropped.
> >> +
> >> +Non-bridged network ports of the same switch fabric must not be disturbed in any
> >> +way, shape or form by the enabling of VLAN filtering.
> > 
> > "shape or form" ?
> 
> It's just an expression, I can remove it :)

Yes, please. I think it is weird to use it in this context.

> >> +VLAN devices configured on top of a switchdev network device (e.g: sw0p1.100)
> >> +which is a bridge port member must also observe the following behavior:
> > 
> > It is not clear where VLAN filtering is on / off. On the bridge the VLAN
> > device is enslaved to I believe? Not the bridge the physical port is
> > enslaved to.
> 
> Actually the later, at least in the hardware that I have access to, VLAN
> filtering is global to the entire switch, whether the physical switch
> ports are enslaved in a bridge or not.
> 
> Once you add support for ndo_rx_vlan_{add,kill}_vid(), which ends up
> programming VLAN objects down the physical port, this is not a concern
> anymore because you can seamlessly support the following cases:
> 
> - 1 or more physical ports enslaved into a VLAN aware bridge, 1 or more
> physical ports not enslaved at all with, or without VLAN devices on top
> of these non-bridged physical ports
> 
> - all ports enslaved into a VLAN aware bridge, or multiple bridges, that
> all have the same VLAN filtering attributes (specific to my case here,
> obviously)
> 
> Does that make sense? Some switches like mv88e6xxx do support a per-port
> VLAN filtering/secure/unsecure option.
> 
> > 
> >> +
> >> +- with VLAN filtering turned off, these VLAN devices must be fully functional
> >> +  since the hardware is allowed VID frames. Enslaving VLAN devices into the
> > 
> > "the hardware is allowed VID frames" ?
> 
> I meant to write that the hardware is not doing any ingress VID
> checking, therefore, it allows any VID frame to ingress the physical
> switch port.
> 
> > 
> >> +  bridge might be allowed provided that there is sufficient separation using
> >> +  e.g.: a reserved VLAN ID (4095 for instance) for untagged traffic.
> >> +
> >> +- with VLAN filtering turned on, these VLAN devices should not be allowed to
> >> +  be created because they duplicate functionality/use case with the bridge's
> >> +  VLAN functionality.
> > 
> > We always allow VLAN devices to be created. It is just that we don't
> > allow their *enslavement* to VLAN-aware bridges.
> 
> If you have a bridge that is VLAN aware (br0), and you have a physical
> port enslaved in that bridge (sw0p0) and you create a VLAN device:
> sw0p0.100, it is equivalent to doing:
> 
> bridge vlan add vid 100 dev sw0p0
> bridge vlan add vid 100 dev br0 self
> ip link add name br0.100 link eth0 type vlan id 100
> 
> and use a VLAN device (br0.100) on top of the bridge, because if you do
> either of these two things, it means that you want the host to utilize
> those network interfaces.
> 
> Would you disagree? The difference is basically in the data path
> handling of the VLAN (sort of).

If sw0p0.100 is not present, then a packet with VID 100 received through
sw0p0 will be picked up by the bridge's Rx handler. The bridge will then
forward it.

If sw0p0.100 is present, then the same packet will not be forwarded by
the bridge. The VLAN device will untag it and inject it back into the Rx
path as if it was received by the VLAN device that does not have the
bridge's Rx handler set.

> 
> > 
> >> +
> >> +Because VLAN filtering can be turned on/off at runtime, the switchdev driver
> >> +must be able to re-configure the underlying hardware on the fly to honor the
> >> +toggling of that option and behave appropriately.
> >> +
> >> +A switchdev driver can also refuse to support dynamic toggling of the VLAN
> >> +filtering knob at runtime and require a destruction of the bridge device(s) and
> >> +creation of new bridge device(s) with a different VLAN filtering value to
> >> +ensure VLAN awareness is pushed down to the HW.
> >> +
> >> +IGMP snooping
> >> +~~~~~~~~~~~~~
> >> +
> >> +The Linux bridge allows the configuration of IGMP snooping (compile and run
> >> +time) which must be observed by the underlying switchdev network device/hardware
> >> +in the following way:
> >> +
> >> +- when IGMP snooping is turned off, multicast traffic must be flooded to all
> >> +  switch ports within the same broadcast domain. The CPU/management port
> >> +  should ideally not be flooded and continue to learn multicast traffic through
> >> +  the network stack notifications. If the hardware is not capable of doing that
> >> +  then the CPU/management port must also be flooded and multicast filtering
> >> +  happens in software.
> >> +
> >> +- when IGMP snooping is turned on, multicast traffic must selectively flow
> >> +  to the appropriate network ports (including CPU/management port) and not flood
> >> +  the switch.
> >> +
> >> +Note: reserved multicast addresses (e.g.: BPDUs) as well as Local Network
> >> +Control block (224.0.0.0 - 224.0.0.255) do not require IGMP and should always
> >> +be flooded.
> > 
> > I'm not sure that these paragraphs are actually needed. You're basically
> > describing RFC 4541 on which the IGMP snooping functionality in the
> > Linux bridge is based on.
> > 
> >> +
> >> +Because IGMP snooping can be turned on/off at runtime, the switchdev driver must
> >> +be able to re-configure the underlying hardware on the fly to honor the toggling
> >> +of that option and behave appropriately.
> >> +
> >> +A switchdev driver can also refuse to support dynamic toggling of the multicast
> >> +snooping knob at runtime and require the destruction of the bridge device(s)
> >> +and creation of a new bridge device(s) with a different multicast snooping
> >> +value.
> > 
> > You should probably get the patch that allows this vetoing merged before
> > sending this documentation patch.
> 
> Well, technically the switchdev attribute allows returning an error, it
> is just that we do not act (yet) on it in the bridge code, I can take
> that part out fo correctness for now and submit a patch to that
> documentation file once I submit the change to the bridge layer.

+1

> -- 
> Florian