Message-Id: <06798029-660D-454E-8628-3A9B9E1AF6F8@safebits.tech>
Date: Sat, 7 Oct 2023 14:39:31 +0300
From: Luci Stanescu <luci@...ebits.tech>
To: "David S. Miller" <davem@...emloft.net>, David Ahern <dsahern@...nel.org>
Cc: netdev@...r.kernel.org
Subject: IPv6 recvmsg() wrong scope for source address when using VRFs

Hi,

I've discovered that the wrong sin6_scope_id is filled in by recvmsg() in msg_name when using VRFs. Specifically, the scope contains the index of the VRF interface, instead of the slave on which the packet was received. This scope is unfortunately useless if link-local addressing is used.

The context in which I discovered this issue is using non-local communication with UDP sockets and multicast (specifically having a DHCPv6 server on an interface enslaved to a VRF), but I believe the issue may be applicable to other transports and it certainly applies to unicast, which I've used to reproduce the issue in a simpler way.

Here's how to reproduce. I'm going to exemplify using Python and local communication with veth devices for brevity. I'm using Ubuntu 22.04 LTS, with kernel 6.2.0-34, but I've tracked this down in the source code in the master branch (further down), so please bear with me.

I'm going to call my VRF interface "myvrf". I'm going to create a veth pair and enslave one end to the VRF.

    ip link add myvrf type vrf table 42
    ip link set myvrf up
    ip link add veth1 type veth peer name veth2
    ip link set veth1 master myvrf up
    ip link set veth2 up

    # ip link sh dev myvrf
    110: myvrf: <NOARP,MASTER,UP,LOWER_UP> mtu 65575 qdisc noqueue state UP mode DEFAULT group default qlen 1000
        link/ether da:ca:c9:2b:6e:02 brd ff:ff:ff:ff:ff:ff

    # ip addr sh dev veth1
    112: veth1@...h2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master myvrf state UP group default qlen 1000
        link/ether 32:63:cf:f5:08:35 brd ff:ff:ff:ff:ff:ff
        inet6 fe80::3063:cfff:fef5:835/64 scope link
           valid_lft forever preferred_lft forever

    # ip addr sh dev veth2
    111: veth2@...h1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
        link/ether 1a:8f:5a:85:3c:c0 brd ff:ff:ff:ff:ff:ff
        inet6 fe80::188f:5aff:fe85:3cc0/64 scope link
           valid_lft forever preferred_lft forever

The receiver:

    import socket
    import struct

    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_RECVPKTINFO, 1)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, b'myvrf')
    s.bind(('', 2000, 0, 0))

    while True:
        data, cmsg_list, flags, source = s.recvmsg(4096, 4096)
        for level, type, cmsg_data in cmsg_list:
            if level == socket.IPPROTO_IPV6 and type == socket.IPV6_PKTINFO:
                source_address, source_scope = struct.unpack('@...I', cmsg_data)
                source_address = socket.inet_ntop(socket.AF_INET6, source_address)
                print("PKTINFO destination {} {}".format(source_address, source_scope))
        source_address, source_port, source_flow, source_scope = source
        print("name source {} {}".format(source_address, source_scope))

The same thing happens, as expected, if sysctl net.ipv4.udp_l3mdev_accept is set to 1 and the receiver doesn't bind the socket to the VRF master device.
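
The interface indices shown above (110 for myvrf, 111 for veth2, 112 for veth1) are specific to this machine. As a minimal sketch, a script reproducing this setup could resolve them by name with socket.if_nametoindex() rather than hardcoding them. The helper name scoped_destination and its resolve parameter are hypothetical, introduced here only for illustration:

```python
import socket


def scoped_destination(address, port, ifname, resolve=socket.if_nametoindex):
    """Build an AF_INET6 sockaddr tuple scoped to the interface `ifname`.

    `resolve` maps an interface name to its index; it is a hypothetical
    hook (defaulting to socket.if_nametoindex) so the helper can be
    exercised without the interface actually existing.
    """
    flowinfo = 0
    return (address, port, flowinfo, resolve(ifname))


# Possible use on a host where veth2 exists:
#   dest = scoped_destination('fe80::3063:cfff:fef5:835', 2000, 'veth2')
#   sock.sendto(b'foo', dest)
```

The returned tuple has the same (host, port, flowinfo, scope_id) shape that sendto() and recvmsg() use for AF_INET6 sockets.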

The sender is going to use the link-local address of veth1 to address the packet on veth2 (scope 111):

    import socket

    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    dest = ('fe80::3063:cfff:fef5:835', 2000, 0, 111)
    s.sendto(b'foo', dest)

Please note that the destination address is the veth1 link-local address and the scope is the veth2 interface index. The receiver will print this:

    PKTINFO destination fe80::3063:cfff:fef5:835 112
    name source fe80::188f:5aff:fe85:3cc0 110

Please note that the scope of the destination (from IPV6_PKTINFO) is, correctly, the interface index of the receiving interface, veth1. However, the scope of the source in the msg_name is the interface index of the VRF master device. Unfortunately, for link-local addressing, the scope of the VRF master device is useless. In my original problem, a DHCPv6 server wouldn't be able to send a response packet to the link-local address.

While an application could certainly use IPV6_PKTINFO to work around this problem, I believe it feels like a bit of a hack.

I've tracked this down in the source code to the following (please bear with my explanations, as I'm not really familiar with the code):

First, in 2014, the scope was changed from IP6CB(skb)->iif to inet6_iif(skb) in commit https://github.com/torvalds/linux/commit/4330487acfff0cf1d7b14d238583a182e0a444bb. At the time, that function from include/linux/ipv6.h simply returned IP6CB(skb)->iif, so that was a bit of a NOOP.

Then, in 2016, inet6_iif was changed to return the VRF master if IP6CB(skb)->iif was enslaved to a VRF in this commit: https://github.com/torvalds/linux/commit/74b20582ac389ee9f18a6fcc0eef244658ce8de0. Now, that also made sense because at the time you couldn't connect() or sendmsg() over a VRF by specifying a VRF slave interface index as a destination; you had to specify the VRF master interface index in the scope.
Using link-local addresses of VRF-enslaved devices at this point in time would've been impossible anyway.

But then, in 2018, a series of patches allowed things like connect() and sendmsg() to specify the index of a VRF slave interface, thus allowing link-local addresses to be used. For example:

https://github.com/torvalds/linux/commit/54dc3e3324829d346c959ff774626d9c6c9a65b5
https://github.com/torvalds/linux/commit/6da5b0f027a825df2aebc1927a27bda185dc03d4

I do not know enough about the code to understand whether, after those patches in 2018, inet6_iif() could be changed to return the VRF slave device instead of the master, or whether recvmsg() should no longer use inet6_iif(), but I do believe the scope returned by recvmsg() is a bug.

Thank you for your time!

--
Luci Stanescu
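
For readers hitting the same behaviour, the IPV6_PKTINFO workaround mentioned in the message could be sketched roughly as follows. This is an illustration under stated assumptions, not a fix for the kernel behaviour: the helper names received_ifindex and reply_destination are hypothetical, and the "@16sI" format is an assumption based on the layout of struct in6_pktinfo (a 16-byte address followed by an unsigned int interface index).

```python
import ipaddress
import socket
import struct

# Assumed layout of struct in6_pktinfo: 16-byte IPv6 address followed by an
# unsigned int interface index, in native byte order and alignment.
PKTINFO_FORMAT = "@16sI"
PKTINFO_SIZE = struct.calcsize(PKTINFO_FORMAT)


def received_ifindex(cmsg_list):
    """Return the receiving interface index from IPV6_PKTINFO, or None."""
    for level, cmsg_type, cmsg_data in cmsg_list:
        if level == socket.IPPROTO_IPV6 and cmsg_type == socket.IPV6_PKTINFO:
            _address, ifindex = struct.unpack(
                PKTINFO_FORMAT, cmsg_data[:PKTINFO_SIZE])
            return ifindex
    return None


def reply_destination(source, cmsg_list):
    """Return `source` with its scope replaced by the receiving interface.

    The scope is only replaced for link-local source addresses, where the
    VRF master index reported in msg_name cannot be used to reply.
    """
    host, port, flowinfo, scope_id = source
    ifindex = received_ifindex(cmsg_list)
    # Drop any "%zone" suffix before parsing the textual address.
    address = ipaddress.IPv6Address(host.split("%", 1)[0])
    if ifindex is not None and address.is_link_local:
        scope_id = ifindex
    return (host, port, flowinfo, scope_id)
```

In the receiver loop from the message, a reply could then be sent with s.sendto(data, reply_destination(source, cmsg_list)), so that the scope is the index of the enslaved interface (112 in the example) rather than that of the VRF master (110).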