Date:   Thu, 16 Jun 2022 10:08:20 -0700
From:   Maciej Żenczykowski <maze@...gle.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     "Subash Abhinov Kasiviswanathan (KS)" <quic_subashab@...cinc.com>,
        "David S. Miller" <davem@...emloft.net>,
        David Ahern <dsahern@...nel.org>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Linux NetDev <netdev@...r.kernel.org>,
        Stefano Brivio <sbrivio@...hat.com>,
        Kaustubh Pandey <quic_kapandey@...cinc.com>,
        Sean Tranchetti <quic_stranche@...cinc.com>
Subject: Re: [PATCH net v2 1/2] ipv6: Honor route mtu if it is within limit of
 dev mtu

On Thu, Jun 16, 2022 at 9:42 AM Jakub Kicinski <kuba@...nel.org> wrote:
> On Thu, 16 Jun 2022 00:33:02 -0700 Maciej Żenczykowski wrote:
> > On Wed, Jun 15, 2022 at 10:36 PM Subash Abhinov Kasiviswanathan (KS) <quic_subashab@...cinc.com> wrote:
> > > >> CC maze, please add him if there is v3
> > > >>
> > > >> I feel like the problem is with the fact that link mtu resets protocol
> > > >> MTUs. Nothing we can do about that, so why not set link MTU to 9k (or
> > > >> whatever other quantification of infinity there is) so you don't have
> > > >> to touch it as you discover the MTU for v4 and v6?
> > >
> > > That's a good point.
> >
> > Because link mtu affects rx mtu which affects nic buffer allocations.
> > Somewhere in the vicinity of mtu 1500..2048 your packets stop fitting
> > in 2kB of memory and need 4kB (or more)
>
> I was afraid someone would point that out :) Luckily the values Subash
> mentioned were both under 2k, and hopefully the device can do scatter?
> 🤞😟 (Don't modems do LRO or some other form of aggregation usually?)

You'd be amazed at how ...minimal... these (cellphone modem/wifi) devices are.

I've long given up on expecting these devices to do fundamental things
like scatter gather, or transmit or receive checksum offload.

Sure *some* newer ones are better and can even do TSO or some form of
HWGRO, maybe even some limited multiqueue, but it's rare.

Note: an mtu above ~3.5kB also breaks (or at least used to???) xdp,
because xdp requires the packet to fit in a single page.
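To make the ~3.5kB figure concrete, here is a rough back-of-the-envelope sketch. The page size, headroom, and tailroom values are assumptions chosen for illustration (they vary by architecture, kernel version, and driver), and the function names are hypothetical:

```python
# Rough estimate of the largest mtu that still fits in one page.
# All constants below are illustrative assumptions, not values taken
# from any particular driver.
PAGE_SIZE = 4096      # assumption: 4 KiB pages
XDP_HEADROOM = 256    # assumption: headroom reserved in front of the packet
TAILROOM = 320        # assumption: space kept after the packet for skb metadata
ETH_HLEN = 14         # Ethernet header length


def max_single_page_mtu(page_size=PAGE_SIZE, headroom=XDP_HEADROOM,
                        tailroom=TAILROOM, l2_header=ETH_HLEN):
    """Largest mtu whose frame, headroom and tailroom fit in one page."""
    return page_size - headroom - tailroom - l2_header


def fits_in_single_page(mtu, **kwargs):
    """True if a frame of this mtu fits under the assumed layout."""
    return mtu <= max_single_page_mtu(**kwargs)


print(max_single_page_mtu())  # 3506 with the assumed constants
```

With these assumed numbers the limit lands at roughly 3.5kB, which is consistent with the figure above; a different headroom or tailroom shifts it accordingly.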

Additionally, a severe lack of trust in cell/wifi firmware's ability
to withstand remote compromise (due to inability to audit the source
code) means sometimes the nics even lack DMA access to system ram,
and instead either rx, or both rx and tx, are bounce buffered, either
by virtue of the driver doing memcpy or some separate hw dma engine.

> > > >> My worry is that the tweaking of the route MTU update heuristic will
> > > >> have no end.
> > > >>
> > > >> Stefano, does that make sense or do you think the change is good?
> > >
> > > The only concern is that current behavior causes the initial packets
> > > after interface MTU increase to get dropped as part of PMTUD if the IPv6
> > > PMTU itself didn't increase. I am not sure if that was the intended
> > > behavior as part of the original change. Stefano, could you please confirm?
> > >
> > > > I vaguely recall that if you don't want device mtu changes to affect
> > > > ipv6 route mtu, then you should set 'mtu lock' on the routes.
> > > > (this meaning of 'lock' for v6 is different than for ipv4, where
> > > > 'lock' means transmit IPv4/TCP with Don't Frag bit unset)
> > >
> > > I assume 'mtu lock' here refers to setting the PMTU on the IPv6 routes
> > > statically. The issue with that approach is that router advertisements
> > > can no longer update PMTU once a static route is configured.
> >
> > yeah.   Hmm should RA generated routes use locked mtu too?
> > I think the only reason an RA generated route would have mtu information
> > is for it to stick...
>
> If link MTU is lower than RA MTU do we do min() or ignore the RA MTU?

I think we simply ignore it - if link mtu is changed, we'll update the
routes on the next RA we receive (which will presumably have the mtu
information again).
Perhaps link mtu change should result in immediate RS to get an RA soon??

Behaviour for mtu > link mtu heavily depends on the driver.
For RX many drivers will fail to receive packets larger than link mtu
(rx buffer overrun), but often there's some wiggle room - this is due
to how rx buffers are allocated.
i.e. 1500 mtu means it can receive up to 1536 byte packets...
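One way the 1500 -> 1536 wiggle room and the 2kB -> 4kB jump can come about is sketched below. This is only an illustration: the 64-byte alignment, the inclusion of VLAN tag and FCS, and the power-of-two allocation are assumptions, and real drivers size their rx buffers in many different ways. The function names are hypothetical.

```python
# Illustrative rx buffer sizing; the alignment and rounding choices are
# assumptions, not the behaviour of any specific driver.
ETH_HLEN = 14     # Ethernet header
VLAN_HLEN = 4     # one VLAN tag
ETH_FCS_LEN = 4   # frame check sequence


def round_up(value, multiple):
    """Round value up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


def rx_frame_room(mtu, align=64):
    """Bytes reserved for one received frame, aligned to `align`."""
    return round_up(mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN, align)


def rx_alloc_size(mtu, align=64):
    """Allocation size if buffers come in power-of-two sizes."""
    room = rx_frame_room(mtu, align)
    return 1 << (room - 1).bit_length()


print(rx_frame_room(1500))  # 1536
print(rx_alloc_size(1500))  # 2048
print(rx_alloc_size(2048))  # 4096
```

Under these assumptions a 1500 mtu leaves room for frames slightly larger than the mtu, while an mtu of 2048 no longer fits in a 2kB buffer and doubles the per-packet allocation.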

For tx it again depends on the driver, some reject packets > mtu,
others can actually send arbitrary sized packets (up to some limit
like 64KB or even higher), because tx allocation does not require any
statically sized buffers like receive does.

All this means that ultimately route MTU > link mtu is unlikely to
work no matter what we do - on at least some nics.

Anyway... lots of words to say ignore 'RA MTU > link/device mtu' seems
like the right call, while if it is <= then set it as locked?
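The proposal above could be sketched roughly as follows. This is not kernel code, just a model of the suggested policy; `RouteMtu` and `route_mtu_from_ra` are hypothetical names, and the check against the IPv6 minimum mtu of 1280 is an extra assumption that was not part of the suggestion itself.

```python
from typing import NamedTuple, Optional

IPV6_MIN_MTU = 1280  # minimum link mtu required by IPv6 (RFC 8200)


class RouteMtu(NamedTuple):
    mtu: int
    locked: bool


def route_mtu_from_ra(ra_mtu: int, link_mtu: int) -> Optional[RouteMtu]:
    """Model of the suggested policy for an mtu carried in an RA.

    Returns None when the RA mtu should be ignored, otherwise the route
    mtu to install, marked as locked.
    """
    if ra_mtu > link_mtu:
        # RA mtu larger than link/device mtu: ignore it.
        return None
    if ra_mtu < IPV6_MIN_MTU:
        # Assumption added for this sketch: reject values below the
        # IPv6 minimum; this case is not discussed in the thread.
        return None
    # RA mtu <= link mtu: install it and lock it so that later device
    # mtu changes do not override it.
    return RouteMtu(mtu=ra_mtu, locked=True)


print(route_mtu_from_ra(9000, 1500))  # None
print(route_mtu_from_ra(1400, 1500))  # RouteMtu(mtu=1400, locked=True)
```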
