Message-ID: <06425eb5-906a-5805-d293-70d240a1197b@molgen.mpg.de>
Date: Tue, 31 Aug 2021 12:12:13 +0200
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Saikrishna Arcot <sarcot@...rosoft.com>,
Mike Manning <mmanning@...tta.att-mail.com>
Cc: netdev@...r.kernel.org, David Ahern <dsahern@...il.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: Change in behavior for bound vs unbound sockets
[cc: +maintainers and commit author and reviewers]
Dear Saikrishna,
Thank you for bringing this issue, which you found while upgrading the
Linux kernel in SONiC [1], to the mailing list.
On 31.08.21 at 01:47, Saikrishna Arcot wrote:
> When upgrading from 4.19.152 to 5.10.40, I noticed a change in
> behavior in how incoming UDP packets are assigned to sockets that are
> bound to an interface and a socket that is not bound to any
> interface. This affects the dhcrelay program in isc-dhcp, when it is
> compiled to use regular UDP sockets and not raw sockets.
>
> For each interface it finds on the system (or is passed in via
> command-line), dhcrelay opens a UDP socket listening on port 67 and
> bound to that interface. Then, at the end, it opens a UDP socket also
> listening on port 67, but not bound to any interface (this socket is
> used for sending, mainly). It expects that for packets that arrived
> on an interface for which a bound socket is opened, it will arrive on
> that bound socket. This was true for 4.19.152, but on 5.10.40,
> packets arrive on the unbound socket only, and never on the bound
> socket. dhcrelay discards any packets that it sees on the unbound
> socket. Because of this, this application breaks.
>
> I made a test application that creates two UDP sockets, binds one of
> them to the loopback interface, and has them both listen on 0.0.0.0
> with some random port. Then, it waits for a message on those two
> sockets, and prints out which socket it received a message on. With
> another application (such as nc) sending some UDP message, I can see
> that on 4.19.152, the test application gets the message on the bound
> socket consistently, whereas on 5.10.40, it gets the message on the
> unbound socket consistently. I have a dev machine running 5.4.0, and
> it gets the message on the unbound socket consistently as well.
It would be great if you shared your test application.
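For anyone who wants to reproduce this, the test described above could be
sketched roughly as follows. This is a guess at the setup, not Saikrishna's
actual program: the function names (open_udp_socket, receive_once, report),
the use of SO_REUSEADDR so that both sockets can share the port, and the
choice of Python are all my assumptions. SO_BINDTODEVICE is Linux-specific
and may need root or CAP_NET_RAW, depending on the kernel.

```python
import select
import socket


def open_udp_socket(port, ifname=None):
    """Open a UDP socket on 0.0.0.0:port, optionally bound to one interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Both sockets listen on the same address and port, so allow address reuse.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if ifname is not None:
        # Linux-specific option; may require root or CAP_NET_RAW.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, ifname.encode())
    sock.bind(("0.0.0.0", port))
    return sock


def receive_once(labelled, timeout):
    """Wait up to `timeout` seconds and return the labels of the sockets
    that received a datagram. `labelled` maps socket objects to labels."""
    readable, _, _ = select.select(list(labelled), [], [], timeout)
    hits = []
    for sock in readable:
        sock.recvfrom(65535)  # drain the datagram so it is reported only once
        hits.append(labelled[sock])
    return hits


def report(port, ifname="lo", timeout=30.0):
    """Open a bound and an unbound socket on `port` and print which one
    receives the next datagram."""
    bound = open_udp_socket(port, ifname)
    unbound = open_udp_socket(port)  # opened last, as dhcrelay is described to do
    try:
        hits = receive_once({bound: "bound", unbound: "unbound"}, timeout)
    finally:
        bound.close()
        unbound.close()
    print("received on:", ", ".join(hits) if hits else "nothing (timeout)")
    return hits
```

To try it, call report(5000) as root and, from another shell, send a
datagram with something like "echo hi | nc -u 127.0.0.1 5000"; the port
number is arbitrary.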
> I traced it to one commit (6da5b0f027a8 "net: ensure unbound datagram
> socket to be chosen when not in a VRF") that makes sure that when not
> in a VRF, the unbound socket is chosen over the bound socket, if both
> are available. If I revert this commit and two other commits that
> made changes on top of this, I can see that packets get sent to the
> bound socket instead. There's similar commits made for TCP and raw
> sockets as well, as part of that patch series.
Commit 6da5b0f027a8 (net: ensure unbound datagram socket to be chosen
when not in a VRF) was added to Linux 5.0.
> Is the intention of those commits also meant to affect sockets that
> are bound to just regular interfaces (and not only VRFs)? If so,
> since this change breaks a userspace application, is it possible to
> add a config that reverts to the old behavior, where bound sockets
> are preferred over unbound sockets?
If it breaks user space, the old behavior needs to be restored according
to Linux’s no-regression policy. Let’s hope that, in the future, better
testing infrastructure catches such issues earlier.
Kind regards,
Paul
PS:
> --
> Saikrishna Arcot
Saikrishna, if you care, the standard signature delimiter has a trailing
space [2].
[1]: https://github.com/Azure/sonic-linux-kernel/pull/227/
[2]: https://en.wikipedia.org/wiki/Signature_block#Standard_delimiter