Message-ID: <CA+h21hqZAUccQ3vsJTFtv8bifC_Uv9cmL5WXM7DO7EwERk3GFg@mail.gmail.com>
Date: Thu, 26 Mar 2020 02:30:52 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Florian Fainelli <f.fainelli@...il.com>
Cc: Andrew Lunn <andrew@...n.ch>,
	Vivien Didelot <vivien.didelot@...il.com>,
	"David S. Miller" <davem@...emloft.net>,
	Jakub Kicinski <jakub.kicinski@...ronome.com>,
	murali.policharla@...adcom.com,
	Stephen Hemminger <stephen@...workplumber.org>,
	Jiri Pirko <jiri@...nulli.us>,
	Ido Schimmel <idosch@...sch.org>,
	Jakub Kicinski <kuba@...nel.org>,
	Nikolay Aleksandrov <nikolay@...ulusnetworks.com>,
	netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH v2 net-next 10/10] net: bridge: implement
	auto-normalization of MTU for hardware datapath

On Thu, 26 Mar 2020 at 01:18, Florian Fainelli <f.fainelli@...il.com> wrote:
>
>
>
> On 3/25/2020 8:22 AM, Vladimir Oltean wrote:
> > From: Vladimir Oltean <vladimir.oltean@....com>
> >
> > In the initial attempt to add MTU configuration for DSA:
> >
> > https://patchwork.ozlabs.org/cover/1199868/
> >
> > Florian raised a concern about the bridge MTU normalization logic (when
> > you bridge an interface with MTU 9000 and one with MTU 1500). His
> > expectation was that the bridge would automatically change the MTU of
> > all its slave ports to the minimum MTU, if those slaves are part of the
> > same hardware bridge. However, it doesn't do that, and for good reason,
> > I think. What br_mtu_auto_adjust() does is it adjusts the MTU of the
> > bridge net device itself, and not that of any slave port. If it were to
> > modify the MTU of the slave ports, the effect would be that the user
> > wouldn't be able to increase the MTU of any bridge slave port as long as
> > it was part of the bridge, which would be a bit annoying to say the
> > least.
> >
> > The idea behind this behavior is that normal termination from Linux over
> > the L2 forwarding domain described by DSA should happen over the bridge
> > net device, which _is_ properly limited by the minimum MTU. And
> > termination over individual slave device is possible even if those are
> > bridged. But that is not "forwarding", so there's no reason to do
> > normalization there, since only a single interface sees that packet.
> >
> > The real problem is with the offloaded data path, where of course, the
> > bridge net device MTU is ignored. So a packet received on an interface
> > with MTU 9000 would still be forwarded to an interface with MTU 1500.
> > And that is exactly what this patch is trying to prevent from happening.
> >
> > Florian's idea was that all hardware ports having the same
> > netdev_port_same_parent_id should be adjusted to have the same MTU.
> > The MTU that we attempt to configure the ports to is the most recently
> > modified MTU. The attempt is to follow user intention as closely as
> > possible and not be annoying at that.
> >
> > So there are 2 cases really:
> >
> > ip link set dev sw0p0 master br0
> > ip link set dev sw0p1 mtu 1400
> > ip link set dev sw0p1 master br0
> >
> > The above sequence will make sw0p0 inherit MTU 1400 as well.
> >
> > The second case:
> >
> > ip link set dev sw0p0 master br0
> > ip link set dev sw0p1 master br0
> > ip link set dev sw0p0 mtu 1400
> >
> > This sequence will make sw0p1 inherit MTU 1400 from sw0p0.
> >
> > Suggested-by: Florian Fainelli <f.fainelli@...il.com>
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> > ---
> > net/bridge/br.c | 1 +
> > net/bridge/br_if.c | 93 +++++++++++++++++++++++++++++++++++++++++
> > net/bridge/br_private.h | 1 +
> > 3 files changed, 95 insertions(+)
> >
> > diff --git a/net/bridge/br.c b/net/bridge/br.c
> > index b6fe30e3768f..5f05380df1ee 100644
> > --- a/net/bridge/br.c
> > +++ b/net/bridge/br.c
> > @@ -57,6 +57,7 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v
> >
> >  	switch (event) {
> >  	case NETDEV_CHANGEMTU:
> > +		br_mtu_normalization(br, dev);
>
> I do not remember if you are allowed to sleep in a netdevice notifier, I
> believe not, so you may need to pass a gfp_t to br_mtu_normalization for
> allocations to be GFP_ATOMIC when called from that context, and
> GFP_KERNEL from br_add_if().
>

Not only can you sleep, but the RTNL is also held. It's bliss!

> It would be nice if we could avoid doing these allocations when called
> from the netdev notifier though, could we just keep the information
> around since the br_hw_port follows the same lifetime as the
> net_bridge_port structure. Other than that, this looks good to me, thanks!

So does this comment still apply then?

I wanted to be as self-contained as possible. It's a pain to keep the
old_mtu list for rollback, as the code probably shows. I wouldn't go as
far as to add more stuff into struct net_bridge_port for the sole
purpose of passing data around in this function.
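
To make the rollback point concrete, here is a minimal userspace sketch
of the idea. It is not the code from the patch: struct sim_port,
sim_set_mtu(), sim_normalize_mtu(), MAX_PORTS and the minimum MTU of 68
are all hypothetical names and values chosen for illustration. It
propagates the most recently changed MTU to every port sharing the same
parent ID, and restores the saved old MTUs if any port refuses the new
value.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 8	/* assumption: small fixed bound, sketch only */

/* Hypothetical stand-in for a bridge port, not struct net_bridge_port. */
struct sim_port {
	int parent_id;	/* ports with equal parent_id share one switch */
	int mtu;	/* current MTU */
	int max_mtu;	/* largest MTU this port accepts */
};

/* Stand-in for an MTU setter: 0 on success, -1 if the value is rejected. */
static int sim_set_mtu(struct sim_port *p, int mtu)
{
	if (mtu < 68 || mtu > p->max_mtu)
		return -1;
	p->mtu = mtu;
	return 0;
}

/*
 * Copy ports[src].mtu to every other port with the same parent_id.
 * If one port rejects it, put the already-changed ports back to their
 * old MTU and return -1. The source port itself is never touched.
 */
static int sim_normalize_mtu(struct sim_port *ports, size_t n, size_t src)
{
	int old_mtu[MAX_PORTS];
	int new_mtu;
	size_t i;

	assert(n <= MAX_PORTS && src < n);
	new_mtu = ports[src].mtu;

	for (i = 0; i < n; i++) {
		if (i == src || ports[i].parent_id != ports[src].parent_id)
			continue;
		old_mtu[i] = ports[i].mtu;	/* remember for rollback */
		if (sim_set_mtu(&ports[i], new_mtu) < 0)
			goto rollback;
	}
	return 0;

rollback:
	/* Walk back over the ports visited before the failing one. */
	while (i-- > 0) {
		if (i == src || ports[i].parent_id != ports[src].parent_id)
			continue;
		ports[i].mtu = old_mtu[i];
	}
	return -1;
}
```

The saved MTUs live in a fixed-size array on the stack here, which
avoids allocation only because the sketch assumes a small port bound.
Kernel code may not be able to make that assumption, so this is meant
to show the shape of the rollback and nothing more.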
> --
> Florian

Thanks,
-Vladimir