lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 07 Oct 2020 20:19:39 -0700
From:   John Fastabend <john.fastabend@...il.com>
To:     Maciej Żenczykowski <maze@...gle.com>,
        John Fastabend <john.fastabend@...il.com>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        bpf <bpf@...r.kernel.org>, Linux NetDev <netdev@...r.kernel.org>,
        Daniel Borkmann <borkmann@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        Lorenz Bauer <lmb@...udflare.com>,
        Shaun Crampton <shaun@...era.io>,
        Lorenzo Bianconi <lorenzo@...nel.org>,
        Marek Majkowski <marek@...udflare.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Eyal Birger <eyal.birger@...il.com>
Subject: Re: [PATCH bpf-next V2 5/6] bpf: Add MTU check for TC-BPF packets
 after egress hook

Maciej Żenczykowski wrote:
> On Wed, Oct 7, 2020 at 3:37 PM John Fastabend <john.fastabend@...il.com> wrote:
> >
> > Daniel Borkmann wrote:
> > > On 10/7/20 6:23 PM, Jesper Dangaard Brouer wrote:
> > > [...]
> > > >   net/core/dev.c |   24 ++++++++++++++++++++++--
> > > >   1 file changed, 22 insertions(+), 2 deletions(-)
> >
> > Couple high-level comments. Whats the problem with just letting the driver
> > consume the packet? I would chalk it up to a buggy BPF program that is
> > sending these packets. The drivers really shouldn't panic or do anything
> > horrible under this case because even today I don't think we can be
> > 100% certain MTU on skb matches set MTU. Imagine the case where I change
> > the MTU from 9kB->1500B there will be some skbs in-flight with the larger
> > length and some with the shorter. If the drivers panic/fault or otherwise
> > does something else horrible thats not going to be friendly in general case
> > regardless of what BPF does. And seeing this type of config is all done
> > async its tricky (not practical) to flush any skbs in-flight.
> >
> > I've spent many hours debugging these types of feature flag, mtu
> > change bugs on the driver side I'm not sure it can be resolved by
> > the stack easily. Better to just build drivers that can handle it IMO.
> >
> > Do we know if sending >MTU size skbs to drivers causes problems in real
> > cases? I haven't tried on the NICs I have here, but I expect they should
> > be fine. Fine here being system keeps running as expected. Dropping the
> > skb either on TX or RX side is expected. Even with this change though
> > its possible for the skb to slip through if I configure MTU on a live
> > system.
> 
> I wholeheartedly agree with the above.
> 
> Ideally the only >mtu check should happen at driver admittance.
> But again ideally it should happen in some core stack location not in
> the driver itself.

Ideally maybe, but IMO we should just let the skb go to the driver
and let the driver sort it out. Even if this means pushing the packet
onto the wire then the switch will drop it or the receiver, etc. A
BPF program can do lots of horrible things that should never be
on the wire otherwise. MTU is just one of them, but sending corrupted
payloads, adding bogus headers, checksums etc. so I don't think we can
reasonable protect against all of them.

Of course if the driver is going to hang/panic then something needs
to be done. Perhaps a needs_mtu_check feature flag, although thats
not so nice either so perhaps drivers just need to handle it themselves.
Also even today the case could happen without BPF as best I can tell
so the drivers should be prepared for it.

> However, due to both gso and vlan offload, even this is not trivial to do...
> The mtu is L3, but drivers/hardware/the wire usually care about L2...
> (because ultimately that's what gets received and must fit in receive buffers)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ