lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Mon, 14 Feb 2022 17:01:13 +0200
From:   Nikolay Aleksandrov <nikolay@...dia.com>
To:     Vladimir Oltean <vladimir.oltean@....com>
CC:     Ido Schimmel <idosch@...sch.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Vladimir Oltean <olteanv@...il.com>,
        Roopa Prabhu <roopa@...dia.com>, Jiri Pirko <jiri@...dia.com>,
        Ido Schimmel <idosch@...dia.com>,
        Rafael Richter <rafael.richter@....de>,
        Daniel Klauer <daniel.klauer@....de>,
        "Tobias Waldekranz" <tobias@...dekranz.com>
Subject: Re: [RFC PATCH net-next 0/5] Replay and offload host VLAN entries in
 DSA

On 14/02/2022 14:04, Vladimir Oltean wrote:
> On Mon, Feb 14, 2022 at 01:11:01PM +0200, Nikolay Aleksandrov wrote:
>> On 14/02/2022 12:42, Vladimir Oltean wrote:
>>> On Mon, Feb 14, 2022 at 12:27:57PM +0200, Nikolay Aleksandrov wrote:
>>>> On 14/02/2022 12:07, Vladimir Oltean wrote:
>>>>> On Mon, Feb 14, 2022 at 11:59:02AM +0200, Ido Schimmel wrote:
>>>>>> Sounds good to me as well. I assume it means patches #1 and #2 will be
>>>>>> changed to make use of this flag and patch #3 will be dropped?
>>>>>
>>>>> Not quite. Patches 1 and 2 will be kept as-is, since fundamentally,
>>>>> their goal is to eliminate a useless SWITCHDEV_PORT_OBJ_ADD event when
>>>>> really nothing has changed (flags == old_flags, no brentry was created).
>>>>>
>>>>
>>>> I don't think that's needed, a two-line change like
>>>> "vlan_already_exists == true && old_flags == flags then do nothing"
>>>> would do the trick in DSA for all drivers and avoid the churn of these patches.
>>>> It will also keep the order of the events consistent with (most of) the rest
>>>> of the bridge. I'd prefer the simpler change which avoids config reverts than
>>>> these two patches.
>>>
>>> I understand you prefer the simpler change which avoids reverting the
>>> struct net_bridge_vlan on error, but "vlan_already_exists == true &&
>>> old_flags == flags then do nothing" is not possible in DSA, because DSA
>>> does not know "old_flags" unless we also pass that via
>>> struct switchdev_obj_port_vlan. If we do that, I don't have much of an
>>> objection, but it still seems cleaner to me if the bridge didn't notify
>>> switchdev at all when it has nothing to say.
>>>
>>> Or where do you mean to place the two-line change?
>>
>> You mention a couple of times in both patches that you'd like to add dsa
>> vlan refcounting infra similar to dsa's host fdb and mdb. So I assumed that involves
>> keeping track of vlan entries in dsa? If so, then I thought you'd record the old flags there.
>>
>> Alternatively I don't mind passing old flags if you don't intend on keeping
>> vlan information in dsa, it's uglier but it will avoid the reverts which will
>> also avoid additional notifications when these cases are hit. It makes sense
>> to have both old flags and new flags if the switchdev checks are done pre-config
>> change so it can veto any transitions it can't handle for some reason.
>>
>> A third option is to do the flags check in the bridge and avoid doing the
>> switchdev call (e.g. in br_switchdev_port_vlan_ calls), that should avoid
>> the reverts as well.
>>
>> If you intend to add vlan refcounting in dsa, then I'd go with just keeping track
>> of the flags, you'll have vlan state anyway, otherwise it's up to you. I'm ok
>> with both options for old flags. 
> 
> I understand it can be confusing why DSA needs to keep VLAN refcounting
> yet it doesn't keep track of flags, so let me explain.
> 
> First thing to mention is that VLAN flags on CPU and DSA (cascade) ports
> don't make much sense at the level of the DSA core. These ports are
> really pipes that transport packets between switches and between the
> switch and the CPU, so 'pvid' and 'untagged' don't really make sense.
> An unwritten convention is for DSA hardware drivers to program the CPU
> and DSA ports such that the VLAN information is unmodified w.r.t. how it
> was/will be on the wire. The only important aspect about VLAN on DSA and
> CPU ports is the VID membership list.
> 
> This isn't the major reason, though, so secondly, I need to introduce a topology.
> 
>                eth0                                           eth1
>            (DSA master)                                (foreign interface)
>                 |
>                 |
>                 | CPU port (no netdev)
>         +----------------+
>         |     sw0p0      |
>         |                |
>         |    Switch 0    |
>         |                |
>         | sw0p0   sw0p1  |
>         +----------------+
>             |       | DSA/cascade port (no netdev)
>             |       |
>             |       |
>             |       | DSA/cascade port (no netdev)
>             |    +--------------+
>             |    | sw1p0        |
>             |    |              |
>             |    |   Switch 1   |
>             |    |              |
>             |    |sw1p1  sw1p2  |
>             |    +--------------+
>  user port  |       |      |
> (has netdev)   user port  user port
> 
> -----------------------------------------------------------
> 
> Example 1: you want forwarding in VLAN 100 between sw0p0, sw1p1 and
> sw1p2, so you issue these commands:
> 
> ip link add br0 type bridge vlan_filtering 1
> ip link set sw0p0 master br0 && bridge vlan add dev sw0p0 vid 100
> ip link set sw1p1 master br0 && bridge vlan add dev sw1p1 vid 100
> ip link set sw1p2 master br0 && bridge vlan add dev sw1p2 vid 100
> 
> DSA must figure out that the sw0p1 and sw1p0 (the cascade ports
> interconnecting the 2 switches) must also be members of VID 100.
> So it must privately add each cascade port of a switch to this VID,
> as long as a user port of that switch is in VID 100.
> 
> So the refcounting must be kept on the destination of where those VLANs
> are programmed (the cascade ports), not on the sources of where those
> VLANs came from (the user ports).
> 
> In this example, sw0p1 and sw1p0 will be members of VID 100 and will
> have a refcount of 3 on it (it is needed by 3 user ports).
> 
> -----------------------------------------------------------
> 
> Example 2: you figure out that sw1p2 should in fact be pvid untagged in
> VID 100, so you change that:
> 
> bridge vlan add dev sw1p2 pvid untagged
> 
> DSA doesn't care about a change of flags, so it needs to change nothing
> about the DSA ports. It just needs to pass on the notification of the
> change of flags to the device driver for sw1p2.
> 
> -----------------------------------------------------------
> 
> Example 3: you realize that you know what, you don't want sw1p2 to be a
> member in VID 100 at all:
> 
> bridge vlan del dev sw1p2 vid 100
> 
> DSA must decrement the refcount of VID 100 on sw0p1 and sw1p0 from 3 to 2.
> 
> -----------------------------------------------------------
> 
> Example 4: you want local termination in VID 100 for this bridge, so you do:
> 
> bridge vlan add dev br0 vid 100 self
> 
> To ensure local termination works for all DSA user ports that are
> members of the bridge (sw0p0, sw1p1), DSA must ensure that the CPU port
> and the DSA ports towards it are members of VID 100.
> 
> So sw0p0 joins VID 100 with a refcount of 1, while sw0p1 and sw1p0 just
> bump the refcount again from 2 to 3.
> 
> -----------------------------------------------------------
> 
> Example 5: you actually want to forward in VID 100 to a foreign
> interface as well:
> 
> ip link set eth1 master br0 && bridge vlan add dev eth1 vid 100
> 
> DSA must bump the refcount of VID 100 on sw0p0 (the CPU port) from 1 to 2,
> and for the cascade ports from 3 to 4. This is such that, if we decide
> that we no longer want local termination (example 4), VID 100 is not
> removed on those ports. Local termination vs software forwarding is
> identical from DSA's perspective, but different from the stack's perspective.
> 
> 
> The way this worked until now was by an approximation that held quite
> honorably more often than not: whenever a bridge VLAN is added to a user
> port, just add it to all DSA and CPU ports too, and never remove it.
> However, it gets quite limiting.
> 
> 
> 
> So the way in which VLAN membership on CPU and DSA ports is refcounted
> makes it not really practical to keep original flags of each VLAN on
> each source port it came from. Your suggestion of keeping each vlan->flags
> in DSA practically boils down to keeping an entire copy of the bridge
> VLAN database. I hope it's clearer now why it isn't really the path I
> would like to follow.
> 

I see, thank you for the examples and the detailed explanation.

> I can look into pruning useless calls in br_switchdev_port_vlan_add()
> itself, if that's what you prefer. Before looking closer at the code,
> that's what I originally had in mind, but it would pollute all callers
> of that function, since we'd have to provide a "bool existing" +
> "u16 old_flags" set of extra arguments to it, and most call paths would
> pass "false, 0". When I noticed that the call paths I'm interested in
> already compute "bool changed" based on __vlan_add_flags(), I thought it
> would be a bit redundant to duplicate logic from that function.
> Just reorder the call to __vlan_add_flags() and I have the info I need.

You already have to add "existing", so I don't see why not add
old flags as well. As I mentioned it makes sense to have it for pre-config
veto of unsupported state transitions. I think it will be simpler and
we'll avoid the config reverts and more notifications.

Cheers,
 Nik

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ