[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ba243807-02b3-a59b-8e7b-c397e28dfdda@kernel.org>
Date: Wed, 31 Aug 2022 20:29:29 -0600
From: David Ahern <dsahern@...nel.org>
To: Richard Gobert <richardbgobert@...il.com>, davem@...emloft.net,
yoshfuji@...ux-ipv6.org, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, shuah@...nel.org, netdev@...r.kernel.org,
linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v2] net-next: Fix IP_UNICAST_IF option behavior for
connected sockets
On 8/29/22 5:18 AM, Richard Gobert wrote:
> The IP_UNICAST_IF socket option is used to set the outgoing interface
> for outbound packets.
>
> The IP_UNICAST_IF socket option was added as it was needed by the
> Wine project, since no other existing option (SO_BINDTODEVICE socket
> option, IP_PKTINFO socket option or the bind function) provided the
> needed characteristics needed by the IP_UNICAST_IF socket option. [1]
> The IP_UNICAST_IF socket option works well for unconnected sockets,
> that is, the interface specified by the IP_UNICAST_IF socket option
> is taken into consideration in the route lookup process when a packet
> is being sent. However, for connected sockets, the outbound interface
> is chosen when connecting the socket, and in the route lookup process
> which is done when a packet is being sent, the interface specified by
> the IP_UNICAST_IF socket option is being ignored.
>
> This inconsistent behavior was reported and discussed in an issue
> opened on systemd's GitHub project [2]. Also, a bug report was
> submitted in the kernel's bugzilla [3].
>
> To understand the problem in more detail, we can look at what happens
> for UDP packets over IPv4 (The same analysis was done separately in
> the referenced systemd issue).
> When a UDP packet is sent the udp_sendmsg function gets called and
> the following happens:
>
> 1. The oif member of the struct ipcm_cookie ipc (which stores the
> output interface of the packet) is initialized by the ipcm_init_sk
> function to inet->sk.sk_bound_dev_if (the device set by the
> SO_BINDTODEVICE socket option).
>
> 2. If the IP_PKTINFO socket option was set, the oif member gets
> overridden by the call to the ip_cmsg_send function.
>
> 3. If no output interface was selected yet, the interface specified
> by the IP_UNICAST_IF socket option is used.
>
> 4. If the socket is connected and no destination address is
> specified in the send function, the struct ipcm_cookie ipc is not
> taken into consideration and the cached route, that was calculated in
> the connect function is being used.
>
> Thus, for a connected socket, the IP_UNICAST_IF sockopt isn't taken
> into consideration.
>
> This patch corrects the behavior of the IP_UNICAST_IF socket option
> for connect()ed sockets by taking into consideration the
> IP_UNICAST_IF sockopt when connecting the socket.
>
> In order to avoid reconnecting the socket, this option is still
> ignored when applied on an already connected socket until connect()
> is called again by the Richard Gobert.
>
> Change the __ip4_datagram_connect function, which is called during
> socket connection, to take into consideration the interface set by
> the IP_UNICAST_IF socket option, in a similar way to what is done in
> the udp_sendmsg function.
>
> [1] https://lore.kernel.org/netdev/1328685717.4736.4.camel@edumazet-laptop/T/
> [2] https://github.com/systemd/systemd/issues/11935#issuecomment-618691018
> [3] https://bugzilla.kernel.org/show_bug.cgi?id=210255
>
> Signed-off-by: Richard Gobert <richardbgobert@...il.com>
> ---
> v1 -> v2: Added self-tests and targeted to net-next.
>
> net/ipv4/datagram.c | 2 ++
> tools/testing/selftests/net/fcnal-test.sh | 30 +++++++++++++++++++++++
> tools/testing/selftests/net/nettest.c | 16 ++++++++++--
> 3 files changed, 46 insertions(+), 2 deletions(-)
>
Reviewed-by: David Ahern <dsahern@...nel.org>
> diff --git a/tools/testing/selftests/net/fcnal-test.sh b/tools/testing/selftests/net/fcnal-test.sh
> index 03b586760164..31c3b6ebd388 100755
> --- a/tools/testing/selftests/net/fcnal-test.sh
> +++ b/tools/testing/selftests/net/fcnal-test.sh
> @@ -1466,6 +1466,13 @@ ipv4_udp_novrf()
> run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -0 ${NSA_IP}
> log_test_addr ${a} $? 0 "Client, device bind via IP_UNICAST_IF"
>
> + log_start
> + run_cmd_nsb nettest -D -s &
> + sleep 1
> + run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -0 ${NSA_IP} -U
> + log_test_addr ${a} $? 0 "Client, device bind via IP_UNICAST_IF, with connect()"
> +
> +
> log_start
> show_hint "Should fail 'Connection refused'"
> run_cmd nettest -D -r ${a}
> @@ -1525,6 +1532,13 @@ ipv4_udp_novrf()
> run_cmd nettest -D -d ${NSA_DEV} -S -r ${a}
> log_test_addr ${a} $? 0 "Global server, device client via IP_UNICAST_IF, local connection"
>
> + log_start
> + run_cmd nettest -s -D &
> + sleep 1
> + run_cmd nettest -D -d ${NSA_DEV} -S -r ${a} -U
> + log_test_addr ${a} $? 0 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
> +
> +
> # IPv4 with device bind has really weird behavior - it overrides the
> # fib lookup, generates an rtable and tries to send the packet. This
> # causes failures for local traffic at different places
> @@ -1550,6 +1564,15 @@ ipv4_udp_novrf()
> sleep 1
> run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S
> log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection"
> +
> + log_start
> + show_hint "Should fail since addresses on loopback are out of device scope"
> + run_cmd nettest -D -s &
> + sleep 1
> + run_cmd nettest -D -r ${a} -d ${NSA_DEV} -S -U
> + log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
> +
> +
> done
>
> a=${NSA_IP}
> @@ -3157,6 +3180,13 @@ ipv6_udp_novrf()
> sleep 1
> run_cmd nettest -6 -D -r ${a} -d ${NSA_DEV} -S
> log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection"
> +
> + log_start
> + show_hint "Should fail 'No route to host' since addresses on loopback are out of device scope"
> + run_cmd nettest -6 -D -s &
> + sleep 1
> + run_cmd nettest -6 -D -r ${a} -d ${NSA_DEV} -S -U
> + log_test_addr ${a} $? 1 "Global server, device client via IP_UNICAST_IF, local connection, with connect()"
> done
>
> a=${NSA_IP6}
All of the tests are 'novrf' but this should work with VRF too. The
device is in a VRF, so binding it to should. Please add these tests
again for the VRF cases. It can be a followup patch.
Powered by blists - more mailing lists