[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c1ce21c6-1d9c-90f3-fef0-b023730bf271@nvidia.com>
Date: Mon, 14 Feb 2022 17:01:13 +0200
From: Nikolay Aleksandrov <nikolay@...dia.com>
To: Vladimir Oltean <vladimir.oltean@....com>
CC: Ido Schimmel <idosch@...sch.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Florian Fainelli <f.fainelli@...il.com>,
Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Vladimir Oltean <olteanv@...il.com>,
Roopa Prabhu <roopa@...dia.com>, Jiri Pirko <jiri@...dia.com>,
Ido Schimmel <idosch@...dia.com>,
Rafael Richter <rafael.richter@....de>,
Daniel Klauer <daniel.klauer@....de>,
"Tobias Waldekranz" <tobias@...dekranz.com>
Subject: Re: [RFC PATCH net-next 0/5] Replay and offload host VLAN entries in
DSA
On 14/02/2022 14:04, Vladimir Oltean wrote:
> On Mon, Feb 14, 2022 at 01:11:01PM +0200, Nikolay Aleksandrov wrote:
>> On 14/02/2022 12:42, Vladimir Oltean wrote:
>>> On Mon, Feb 14, 2022 at 12:27:57PM +0200, Nikolay Aleksandrov wrote:
>>>> On 14/02/2022 12:07, Vladimir Oltean wrote:
>>>>> On Mon, Feb 14, 2022 at 11:59:02AM +0200, Ido Schimmel wrote:
>>>>>> Sounds good to me as well. I assume it means patches #1 and #2 will be
>>>>>> changed to make use of this flag and patch #3 will be dropped?
>>>>>
>>>>> Not quite. Patches 1 and 2 will be kept as-is, since fundamentally,
>>>>> their goal is to eliminate a useless SWITCHDEV_PORT_OBJ_ADD event when
>>>>> really nothing has changed (flags == old_flags, no brentry was created).
>>>>>
>>>>
>>>> I don't think that's needed, a two-line change like
>>>> "vlan_already_exists == true && old_flags == flags then do nothing"
>>>> would do the trick in DSA for all drivers and avoid the churn of these patches.
>>>> It will also keep the order of the events consistent with (most of) the rest
>>>> of the bridge. I'd prefer the simpler change which avoids config reverts than
>>>> these two patches.
>>>
>>> I understand you prefer the simpler change which avoids reverting the
>>> struct net_bridge_vlan on error, but "vlan_already_exists == true &&
>>> old_flags == flags then do nothing" is not possible in DSA, because DSA
>>> does not know "old_flags" unless we also pass that via
>>> struct switchdev_obj_port_vlan. If we do that, I don't have much of an
>>> objection, but it still seems cleaner to me if the bridge didn't notify
>>> switchdev at all when it has nothing to say.
>>>
>>> Or where do you mean to place the two-line change?
>>
>> You mention a couple of times in both patches that you'd like to add dsa
>> vlan refcounting infra similar to dsa's host fdb and mdb. So I assumed that involves
>> keeping track of vlan entries in dsa? If so, then I thought you'd record the old flags there.
>>
>> Alternatively I don't mind passing old flags if you don't intend on keeping
>> vlan information in dsa, it's uglier but it will avoid the reverts which will
>> also avoid additional notifications when these cases are hit. It makes sense
>> to have both old flags and new flags if the switchdev checks are done pre-config
>> change so it can veto any transitions it can't handle for some reason.
>>
>> A third option is to do the flags check in the bridge and avoid doing the
>> switchdev call (e.g. in br_switchdev_port_vlan_ calls), that should avoid
>> the reverts as well.
>>
>> If you intend to add vlan refcounting in dsa, then I'd go with just keeping track
>> of the flags, you'll have vlan state anyway, otherwise it's up to you. I'm ok
>> with both options for old flags.
>
> I understand it can be confusing why DSA needs to keep VLAN refcounting
> yet it doesn't keep track of flags, so let me explain.
>
> First thing to mention is that VLAN flags on CPU and DSA (cascade) ports
> don't make much sense at the level of the DSA core. These ports are
> really pipes that transport packets between switches and between the
> switch and the CPU, so 'pvid' and 'untagged' don't really make sense.
> An unwritten convention is for DSA hardware drivers to program the CPU
> and DSA ports such that the VLAN information is unmodified w.r.t. how it
> was/will be on the wire. The only important aspect about VLAN on DSA and
> CPU ports is the VID membership list.
>
> This isn't the major reason, though, so secondly, I need to introduce a topology.
>
> eth0 eth1
> (DSA master) (foreign interface)
> |
> |
> | CPU port (no netdev)
> +----------------+
> | sw0p0 |
> | |
> | Switch 0 |
> | |
> | sw0p0 sw0p1 |
> +----------------+
> | | DSA/cascade port (no netdev)
> | |
> | |
> | | DSA/cascade port (no netdev)
> | +--------------+
> | | sw1p0 |
> | | |
> | | Switch 1 |
> | | |
> | |sw1p1 sw1p2 |
> | +--------------+
> user port | | |
> (has netdev) user port user port
>
> -----------------------------------------------------------
>
> Example 1: you want forwarding in VLAN 100 between sw0p0, sw1p1 and
> sw1p2, so you issue these commands:
>
> ip link add br0 type bridge vlan_filtering 1
> ip link set sw0p0 master br0 && bridge vlan add dev sw0p0 vid 100
> ip link set sw1p1 master br0 && bridge vlan add dev sw1p1 vid 100
> ip link set sw1p2 master br0 && bridge vlan add dev sw1p2 vid 100
>
> DSA must figure out that the sw0p1 and sw1p0 (the cascade ports
> interconnecting the 2 switches) must also be members of VID 100.
> So it must privately add each cascade port of a switch to this VID,
> as long as a user port of that switch is in VID 100.
>
> So the refcounting must be kept on the destination of where those VLANs
> are programmed (the cascade ports), not on the sources of where those
> VLANs came from (the user ports).
>
> In this example, sw0p1 and sw1p0 will be members of VID 100 and will
> have a refcount of 3 on it (it is needed by 3 user ports).
>
> -----------------------------------------------------------
>
> Example 2: you figure out that sw1p2 should in fact be pvid untagged in
> VID 100, so you change that:
>
> bridge vlan add dev sw1p2 pvid untagged
>
> DSA doesn't care about a change of flags, so it needs to change nothing
> about the DSA ports. It just needs to pass on the notification of the
> change of flags to the device driver for sw1p2.
>
> -----------------------------------------------------------
>
> Example 3: you realize that you know what, you don't want sw1p2 to be a
> member in VID 100 at all:
>
> bridge vlan del dev sw1p2 vid 100
>
> DSA must decrement the refcount of VID 100 on sw0p1 and sw1p0 from 3 to 2.
>
> -----------------------------------------------------------
>
> Example 4: you want local termination in VID 100 for this bridge, so you do:
>
> bridge vlan add dev br0 vid 100 self
>
> To ensure local termination works for all DSA user ports that are
> members of the bridge (sw0p0, sw1p1), DSA must ensure that the CPU port
> and the DSA ports towards it are members of VID 100.
>
> So sw0p0 joins VID 100 with a refcount of 1, while sw0p1 and sw1p0 just
> bump the refcount again from 2 to 3.
>
> -----------------------------------------------------------
>
> Example 5: you actually want to forward in VID 100 to a foreign
> interface as well:
>
> ip link set eth1 master br0 && bridge vlan add dev eth1 vid 100
>
> DSA must bump the refcount of VID 100 on sw0p0 (the CPU port) from 1 to 2,
> and for the cascade ports from 3 to 4. This is such that, if we decide
> that we no longer want local termination (example 4), VID 100 is not
> removed on those ports. Local termination vs software forwarding is
> identical from DSA's perspective, but different from the stack's perspective.
>
>
> The way this worked until now was by an approximation that held quite
> honorably more often than not: whenever a bridge VLAN is added to a user
> port, just add it to all DSA and CPU ports too, and never remove it.
> However, it gets quite limiting.
>
>
>
> So the way in which VLAN membership on CPU and DSA ports is refcounted
> makes it not really practical to keep original flags of each VLAN on
> each source port it came from. Your suggestion of keeping each vlan->flags
> in DSA practically boils down to keeping an entire copy of the bridge
> VLAN database. I hope it's clearer now why it isn't really the path I
> would like to follow.
>
I see, thank you for the examples and the detailed explanation.
> I can look into pruning useless calls in br_switchdev_port_vlan_add()
> itself, if that's what you prefer. Before looking closer at the code,
> that's what I originally had in mind, but it would pollute all callers
> of that function, since we'd have to provide a "bool existing" +
> "u16 old_flags" set of extra arguments to it, and most call paths would
> pass "false, 0". When I noticed that the call paths I'm interested in
> already compute "bool changed" based on __vlan_add_flags(), I thought it
> would be a bit redundant to duplicate logic from that function.
> Just reorder the call to __vlan_add_flags() and I have the info I need.
You already have to add "existing", so I don't see why not add
old flags as well. As I mentioned it makes sense to have it for pre-config
veto of unsupported state transitions. I think it will be simpler and
we'll avoid the config reverts and more notifications.
Cheers,
Nik
Powered by blists - more mailing lists