netdev - Re: [PATCH v2 net-next 10/10] net: bridge: implement auto-normalization of MTU for hardware datapath

Message-ID: <8d2d819c-328c-9b2a-d25b-dccc85b93735@gmail.com>
Date:   Wed, 25 Mar 2020 16:17:54 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Vladimir Oltean <olteanv@...il.com>, andrew@...n.ch,
        vivien.didelot@...il.com, davem@...emloft.net,
        jakub.kicinski@...ronome.com
Cc:     murali.policharla@...adcom.com, stephen@...workplumber.org,
        jiri@...nulli.us, idosch@...sch.org, kuba@...nel.org,
        nikolay@...ulusnetworks.com, netdev@...r.kernel.org
Subject: Re: [PATCH v2 net-next 10/10] net: bridge: implement
 auto-normalization of MTU for hardware datapath



On 3/25/2020 8:22 AM, Vladimir Oltean wrote:
> From: Vladimir Oltean <vladimir.oltean@....com>
> 
> In the initial attempt to add MTU configuration for DSA:
> 
> https://patchwork.ozlabs.org/cover/1199868/
> 
> Florian raised a concern about the bridge MTU normalization logic (when
> you bridge an interface with MTU 9000 and one with MTU 1500). His
> expectation was that the bridge would automatically change the MTU of
> all its slave ports to the minimum MTU, if those slaves are part of the
> same hardware bridge. However, it doesn't do that, and for good reason,
> I think. What br_mtu_auto_adjust() does is it adjusts the MTU of the
> bridge net device itself, and not that of any slave port.  If it were to
> modify the MTU of the slave ports, the effect would be that the user
> wouldn't be able to increase the MTU of any bridge slave port as long as
> it was part of the bridge, which would be a bit annoying to say the
> least.
> 
> The idea behind this behavior is that normal termination from Linux over
> the L2 forwarding domain described by DSA should happen over the bridge
> net device, which _is_ properly limited by the minimum MTU. And
> termination over individual slave devices is possible even if those are
> bridged. But that is not "forwarding", so there's no reason to do
> normalization there, since only a single interface sees that packet.
> 
> The real problem is with the offloaded data path, where of course, the
> bridge net device MTU is ignored. So a packet received on an interface
> with MTU 9000 would still be forwarded to an interface with MTU 1500.
> And that is exactly what this patch is trying to prevent from happening.
> 
> Florian's idea was that all hardware ports having the same
> netdev_port_same_parent_id should be adjusted to have the same MTU.
> The MTU that we attempt to configure the ports to is the most recently
> modified MTU. The attempt is to follow user intention as closely as
> possible and not be annoying at that.
> 
> So there are 2 cases really:
> 
> ip link set dev sw0p0 master br0
> ip link set dev sw0p1 mtu 1400
> ip link set dev sw0p1 master br0
> 
> The above sequence will make sw0p0 inherit MTU 1400 as well.
> 
> The second case:
> 
> ip link set dev sw0p0 master br0
> ip link set dev sw0p1 master br0
> ip link set dev sw0p0 mtu 1400
> 
> This sequence will make sw0p1 inherit MTU 1400 from sw0p0.
> 
> Suggested-by: Florian Fainelli <f.fainelli@...il.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> ---
>  net/bridge/br.c         |  1 +
>  net/bridge/br_if.c      | 93 +++++++++++++++++++++++++++++++++++++++++
>  net/bridge/br_private.h |  1 +
>  3 files changed, 95 insertions(+)
> 
> diff --git a/net/bridge/br.c b/net/bridge/br.c
> index b6fe30e3768f..5f05380df1ee 100644
> --- a/net/bridge/br.c
> +++ b/net/bridge/br.c
> @@ -57,6 +57,7 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v
>  
>  	switch (event) {
>  	case NETDEV_CHANGEMTU:
> +		br_mtu_normalization(br, dev);

I do not remember whether you are allowed to sleep in a netdevice
notifier; I believe not. If that is the case, you may need to pass a
gfp_t to br_mtu_normalization() so that allocations use GFP_ATOMIC when
called from that context and GFP_KERNEL when called from br_add_if().
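
To make the gfp_t suggestion concrete, here is a minimal userspace
sketch of threading an allocation-context flag from each caller down to
the allocation. It is a model, not kernel code: model_gfp_t,
model_alloc(), model_mtu_normalization() and the two caller stand-ins are
hypothetical names, calloc() stands in for the kernel allocator, and the
premise that the notifier path must not sleep is the assumption raised
above rather than something this sketch verifies.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's gfp_t, GFP_KERNEL and GFP_ATOMIC. */
typedef enum { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC } model_gfp_t;

/* Records the flag of the last allocation so the caller's choice is visible. */
static model_gfp_t last_alloc_flags;

/* Stand-in for a kernel allocator call that takes gfp flags. */
static void *model_alloc(size_t n, size_t size, model_gfp_t gfp)
{
	last_alloc_flags = gfp;
	return calloc(n, size);
}

/* Hypothetical shape of br_mtu_normalization() with an extra gfp argument. */
static int model_mtu_normalization(int num_ports, model_gfp_t gfp)
{
	int *scratch;

	assert(num_ports > 0);
	scratch = model_alloc((size_t)num_ports, sizeof(*scratch), gfp);
	if (!scratch)
		return -1;	/* the kernel version would return -ENOMEM */

	/* ... walk the ports and normalize their MTU here ... */

	free(scratch);
	return 0;
}

/* Notifier path (NETDEV_CHANGEMTU): assumed here to be unable to sleep. */
static int model_device_event_changemtu(int num_ports)
{
	return model_mtu_normalization(num_ports, MODEL_GFP_ATOMIC);
}

/* br_add_if() path: process context, where sleeping is allowed. */
static int model_add_if(int num_ports)
{
	return model_mtu_normalization(num_ports, MODEL_GFP_KERNEL);
}
```

The point of the extra argument is that only the caller knows which
context it runs in, so the shared helper does not have to guess.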

It would be nice if we could avoid doing these allocations when called
from the netdev notifier, though. Could we just keep the information
around, since the br_hw_port follows the same lifetime as the
net_bridge_port structure?

Other than that, this looks good to me, thanks!
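
A rough userspace sketch of the "keep the information around" idea
follows: the per-port record is allocated once when the port joins the
bridge, so the MTU-change path only walks existing records and never
allocates. All names (model_hw_port, model_bridge, model_add_if(),
model_change_mtu(), model_del_all()) are hypothetical, an integer stands
in for the switchdev parent ID, and treating a newly joined port's MTU as
the most recently modified one is my reading of the two cases in the
commit message.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_PORTS 8

/* Hypothetical per-port record with the same lifetime as the bridge port. */
struct model_hw_port {
	int parent_id;	/* stand-in for the switchdev parent ID */
	int mtu;
};

struct model_bridge {
	struct model_hw_port *ports[MODEL_MAX_PORTS];
	int num_ports;
};

/* Walks existing records only: no allocation on this path. */
static void model_normalize(struct model_bridge *br,
			    const struct model_hw_port *changed)
{
	for (int i = 0; i < br->num_ports; i++) {
		struct model_hw_port *p = br->ports[i];

		if (p != changed && p->parent_id == changed->parent_id)
			p->mtu = changed->mtu;
	}
}

/* Join path: the only place that allocates a record. */
static struct model_hw_port *model_add_if(struct model_bridge *br,
					  int parent_id, int mtu)
{
	struct model_hw_port *p;

	if (br->num_ports >= MODEL_MAX_PORTS)
		return NULL;
	p = calloc(1, sizeof(*p));
	if (!p)
		return NULL;
	p->parent_id = parent_id;
	p->mtu = mtu;
	br->ports[br->num_ports++] = p;
	model_normalize(br, p);	/* the joining port's MTU wins */
	return p;
}

/* MTU-change path (the notifier case): reuses the stored record. */
static void model_change_mtu(struct model_bridge *br,
			     struct model_hw_port *p, int mtu)
{
	assert(p != NULL);
	p->mtu = mtu;
	model_normalize(br, p);
}

/* Teardown: records are freed together with the ports. */
static void model_del_all(struct model_bridge *br)
{
	for (int i = 0; i < br->num_ports; i++)
		free(br->ports[i]);
	br->num_ports = 0;
}
```

With two ports sharing a parent ID, joining the second port with MTU
1400 pulls the first one down to 1400, and a later MTU change on either
port propagates to the other without touching the allocator.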
-- 
Florian