lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOEp5Odj=_8Tj2CFNirdNEkA_yx-_8MJJKtpiX_dcQrtn6P76Q@mail.gmail.com>
Date: Thu, 11 Apr 2024 12:39:36 +0300
From: Yuri Benditovich <yuri.benditovich@...nix.com>
To: Jason Wang <jasowang@...hat.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@...il.com>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Shuah Khan <shuah@...nel.org>, yan@...nix.com, 
	andrew@...nix.com, netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net] net: change maximum number of UDP segments to 128

On Thu, Apr 11, 2024 at 8:53 AM Jason Wang <jasowang@...hat.com> wrote:
>
> On Thu, Apr 11, 2024 at 1:32 PM Yuri Benditovich
> <yuri.benditovich@...nix.com> wrote:
> >
> > On Thu, Apr 11, 2024 at 7:04 AM Jason Wang <jasowang@...hat.com> wrote:
> > >
> > > On Wed, Apr 10, 2024 at 4:28 PM Yuri Benditovich
> > > <yuri.benditovich@...nix.com> wrote:
> > > >
> > > > On Wed, Apr 10, 2024 at 9:07 AM Jason Wang <jasowang@...hat.com> wrote:
> > > > >
> > > > > On Tue, Apr 9, 2024 at 1:48 PM Yuri Benditovich
> > > > > <yuri.benditovich@...nix.com> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Apr 9, 2024 at 6:53 AM Jason Wang <jasowang@...hat.com> wrote:
> > > > > >>
> > > > > >> On Mon, Apr 8, 2024 at 3:24 PM Yuri Benditovich
> > > > > >> <yuri.benditovich@...nix.com> wrote:
> > > > > >> >
> > > > > >> > On Mon, Apr 8, 2024 at 9:27 AM Jason Wang <jasowang@...hat.com> wrote:
> > > > > >> > >
> > > > > >> > > On Sun, Apr 7, 2024 at 2:50 AM Yuri Benditovich
> > > > > >> > > <yuri.benditovich@...nix.com> wrote:
> > > > > >> > > >
> > > > > >> > > > Fixes: fc8b2a619469378 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation")
> > > > > >> > > >
> > > > > >> > > > The mentioned above commit adds check of potential number
> > > > > >> > > > of UDP segments vs UDP_MAX_SEGMENTS in linux/virtio_net.h.
> > > > > >> > > > After this change certification test of USO guest-to-guest
> > > > > >> > > > transmit on Windows driver for virtio-net device fails,
> > > > > >> > > > for example with packet size of ~64K and mss of 536 bytes.
> > > > > >> > > > In general the USO should not be more restrictive than TSO.
> > > > > >> > > > Indeed, in case of unreasonably small mss a lot of segments
> > > > > >> > > > can cause queue overflow and packet loss on the destination.
> > > > > >> > > > Limit of 128 segments is good for any practical purpose,
> > > > > >> > > > with minimal meaningful mss of 536 the maximal UDP packet will
> > > > > >> > > > be divided to ~120 segments.
> > > > > >> > >
> > > > > >> > > Assuming different OS guests could run on top of KVM. I wonder if a
> > > > > >> > > better fix is to relax the following check:
> > > > > >> > >
> > > > > >> > > =>                      if (skb->len - p_off > gso_size * UDP_MAX_SEGMENTS)
> > > > > >> > >                                 return -EINVAL;
> > > > > >> > There are also checks vs UDP_MAX_SEGMENTS in udp.c (in ipv4 and ipv6),
> > > > > >> > they do not prevent guest-to-guest from passing but cause packet dropping in
> > > > > >> > other cases..
> > > > > >> > >
> > > > > >> > > Changing UDP_MAX_SEGMENTS may have side effects. For example, a new
> > > > > >> > > Linux guest run on top of a old Linux and other.
> > > > > >> > IMO, in the worst case _in specific setups_ the communication will behave like
> > > > > >> > it does now in all the setups.
> > > > > >>
> > > > > >> I meant if the guest limit is 128 but host limit is 64.
> > > > > >
> > > > > >
> > > > > > If the guest limit is (128 or 64) and host limit is 64:
> > > > > > If we send a UDP packet  with USO (length 64K, mss 600) - it is dropped.
> > > > > > setsockopt does not limit us to use <=64 packets, neither in windows nor in Linux,
> > > > >
> > > > > Just to make sure we are on the same page:
> > > > >
> > > > > Before fc8b2a619469378,
> > > > >
> > > > > 1) Windows guest works on Linux Host since we don't check against
> > > > > UDP_MAX_SEGMENTS on the host
> > > >
> > > >
> > > > Windows guest-to-guest works since there  is no check for
> > > > UDP_MAX_SEGMENTS in virtio_net.h
> > > > Windows guest to Linux host suffers with size=64K mss=536 (as an example)
> > > >
> > > > > 2) Linux guest works on Linux Host, since it will always send packet
> > > > > with less than UDP_MAX_SEGMENTS
> > > >
> > > >
> > > > Not exactly.
> > > > If you use "selftest" (udpgso*) _as is_, it will work as it has
> > > > _internal_ define of 64 and never tries to do more than that.
> > > > But (sorry for repeating that) there is no setsockopt() limitation and
> > > > you're free to modify the test code
> > > > to use more segments (as an example) and this will not work.
> > >
> > > I meant this part in udp_send_skb()
> > >
> > > =>              if (datalen > cork->gso_size * UDP_MAX_SEGMENTS) {
> > >                         kfree_skb(skb);
> > >                         return -EINVAL;
> > >                 }
> > >
> > > >
> > > > >
> > > > >
> > > > > This is the behaviour that we need to stick to, otherwise we break the
> > > > > guests as you've noticed.
> > > >
> > > > We can stick to the behavior I've described above if we change only define
> > > > UDP_MAX_SEGMENTS 64->128 in udp.h but leave untouched
> > > > UDP_MAX_SEGMENTS=64 in udpgso (test code)
> > > > (IMO does not make too much sense but possible)
> > > >
> > > > Actually, all this discussion happens because the initial code of USO
> > > > in the kernel was delivered with define of 64 without any special
> > > > reason, just because this looked enough then.
> > >
> > > Right, but it's too late to "fix", the only thing we can do now is to
> > > make sure to not break the application that worked in the past.
> > >
> > > >
> > > > >
> > > > > If we fix it by increasing the UDP_MAX_SEGMENTS, it fixes for 1) but
> > > > > not necessarily for 2). If we don't check against UDP_MAX_SEGMENTS, it
> > > > > works for 2) as well.
> > > >
> > >
> > > So what I want to say is that having a check for UDP_MAX_SEGMENTS
> > > seems to be problematic. And increasing it to 128 complicates the
> > > problem furtherly.
> > >
> > > > There are following cases:
> > > > - guest-to-guest (win-to-win, lin-to-lin,win-to-lin, lin-to-win)
> > >
> > > If we don't have the check for UDP_MAX_SEGMENTS, everything should be fine.
> > >
> > > If the sender can produces 128 and 64 is checked in the middle of the
> > > datapath in either tun or virtio-net, the packet will be dropped.
> > >
> > > > -guest-to-host (guest=lin, guest=win)
> > >
> > > Same as above.
> > >
> > > > -host-to-host
> > >
> > > I don't see any issue in this part.
> >
> > The issue is that host-to-host UDP packets with more than 64 segments
> > are dropped due to check vs  UDP_MAX_SEGMENTS is udp.c (v4 and v6)
>
> Yes, but if they fail since the introduction of UDP gso if I was not wrong.

In order to prevent failure we can limit the test to work with 64 segments
and it will be happy. Nothing will be broken.

>
> >
> >
> > >
> > > > -host-to-guest (guest=lin, guest=win)
> > >
> > > Same as guest-to-guest.
> > >
> > > >
> > > > Resulting table of results is much more complicated than you described.
> > > > We can't use the udpgso code as the only indicator of go/nogo.
> > >
> > > We should not break userspace. If there's a setup that udpgso can work
> > > in the past but not after a patch is applied, we should avoid such
> > > cases.
> > >
> > > That's why I think avoiding checking UDP_MAX_SEGMENTS and doing other
> > > hardening might be better (for example as Eric suggested).
> >
> > I've checked all this thread and I did not find any Eric's suggestion regarding
> > other hardening. Which suggestion have you mentioned?
>
> Quoted from Eric email (btw, for some reason the discussion is not
> reached to the list, adding list here)
>
> """
>
> > Assuming different OS guests could run on top of KVM. I wonder if a
> > better fix is to relax the following check:
> >
> > =>                      if (skb->len - p_off > gso_size * UDP_MAX_SEGMENTS)
> >                                 return -EINVAL;
> >
> > Changing UDP_MAX_SEGMENTS may have side effects. For example, a new
> > Linux guest run on top of a old Linux and other.
>
> Typical qdisc packet limit is 1000.
>
> I think a limit prevents abuses, like gso_size == 1, and skb->len = 65000

I understand. This was your suggestion to remove the check here.
Then with > 64 segments:
- windows guest-to-guest will work
- host to windows guest will not work
- linux guest to windows guest will not work
- other pairs - I do not know yet

This is what you prefer??


>
> """
>
> The actual value needs more thought, for example 65K may block the
> implementation of jumbogram in the future.
>
> Thanks
>
> >
> > > >
> > > > >
> > > > > >
> > > > > >>
> > > > > >>
> > > > > >> > Note that this limit (UDP_MAX_SEGMENTS) is set
> > > > > >> > in internal kernel define, it is not based on any RFC, not visible via
> > > > > >> > ethtool etc.
> > > > > >>
> > > > > >> In the future, it needs to be advertised to the guest via virtio spec.
> > > > > >> Otherwise it might have compatibility issues.
> > > > > >
> > > > > >
> > > > > > The spec targets 2 areas - hardware solutions and software solutions.
> > > > > > For hardware solutions everything is clear - there are no declared limits and if the hardware will limit the number of segments to anything - it is a bug.
> > > > > > So the advertising makes sense only for hypervisors and for example, qemu will need to obtain this magic number from the kernel.
> > > > >
> > > > > Right.
> > > > >
> > > > > > If/when we'll start adding this to the spec we'll face a reasonable question - why do we need to add something that is not mentioned in any RFC, why not add such a thing for TCP also?
> > > > >
> > > > > So the kernel has gso_max_segs for netdevice. We probably need it for
> > > > > virtio-net as well, and it might be useful for jumbograms as well.
> > > > >
> > > > virtio-net is "limited" by common "#define GSO_MAX_SEGS        65535u"
> > >
> > > This works since the host will do software segmentation as a fallback
> > > which will be very slow.
> > >
> > > >
> > > > >
> > > > > >
> > > > > > Next, let's say we advertise this number over virtio capabilities - does this solve something?
> > > > >
> > > > > For example, doing software segmentation by kernel?
> > > > >
> > > > > > The driver can't communicate this capability up. The setsockopt will not limit this parameter.
> > > > > >
> > > > > >>
> > > > > >> For example, migrate from 64 to 128.
> > > > > >>
> > > > > > Yes, but the migration from higher kernel to lower one looks like an immortal problem.
> > > > >
> > > > > This is not rare. For example, the patch for 128 is applied on the
> > > > > source but not destination.
> > > >
> > > > I agree that such migration can happen and _theoretically_ may cause
> > > > some misunderstanding
> > > > I'm almost sure that practically this can't cause any.misunderstanding
> > > > and we can discuss this specific problem in depth.
> > > > (I just need some time for some tests)
> > >
> > > Thanks
> > >
> > > >
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > >>
> > > > > >> > The same on Windows - the adapter does not have such a capability as maximal
> > > > > >> > number of UDP segments, so the upper layers are unaware of any limitation.
> > > > > >>
> > > > > >> Right, that might be problematic.
> > > > > >>
> > > > > >> The checking of UDP_MAX_SEGMENTS implies an agreement of guest and
> > > > > >> host.  But such implications are not true  .
> > > > > >>
> > > > > >> Thanks
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> > >
> > > > > >> > > Thanks
> > > > > >> > >
> > > > > >> > > >
> > > > > >> > > > Signed-off-by: Yuri Benditovich <yuri.benditovich@...nix.com>
> > > > > >> > > > ---
> > > > > >> > > >  include/linux/udp.h                  | 2 +-
> > > > > >> > > >  tools/testing/selftests/net/udpgso.c | 2 +-
> > > > > >> > > >  2 files changed, 2 insertions(+), 2 deletions(-)
> > > > > >> > > >
> > > > > >> > > > diff --git a/include/linux/udp.h b/include/linux/udp.h
> > > > > >> > > > index 3748e82b627b..7e75ccdf25fe 100644
> > > > > >> > > > --- a/include/linux/udp.h
> > > > > >> > > > +++ b/include/linux/udp.h
> > > > > >> > > > @@ -108,7 +108,7 @@ struct udp_sock {
> > > > > >> > > >  #define udp_assign_bit(nr, sk, val)            \
> > > > > >> > > >         assign_bit(UDP_FLAGS_##nr, &udp_sk(sk)->udp_flags, val)
> > > > > >> > > >
> > > > > >> > > > -#define UDP_MAX_SEGMENTS       (1 << 6UL)
> > > > > >> > > > +#define UDP_MAX_SEGMENTS       (1 << 7UL)
> > > > > >> > > >
> > > > > >> > > >  #define udp_sk(ptr) container_of_const(ptr, struct udp_sock, inet.sk)
> > > > > >> > > >
> > > > > >> > > > diff --git a/tools/testing/selftests/net/udpgso.c b/tools/testing/selftests/net/udpgso.c
> > > > > >> > > > index 1d975bf52af3..85b3baa3f7f3 100644
> > > > > >> > > > --- a/tools/testing/selftests/net/udpgso.c
> > > > > >> > > > +++ b/tools/testing/selftests/net/udpgso.c
> > > > > >> > > > @@ -34,7 +34,7 @@
> > > > > >> > > >  #endif
> > > > > >> > > >
> > > > > >> > > >  #ifndef UDP_MAX_SEGMENTS
> > > > > >> > > > -#define UDP_MAX_SEGMENTS       (1 << 6UL)
> > > > > >> > > > +#define UDP_MAX_SEGMENTS       (1 << 7UL)
> > > > > >> > > >  #endif
> > > > > >> > > >
> > > > > >> > > >  #define CONST_MTU_TEST 1500
> > > > > >> > > > --
> > > > > >> > > > 2.34.3
> > > > > >> > > >
> > > > > >> > >
> > > > > >> >
> > > > > >>
> > > > >
> > > >
> > >
> >
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ