Message-ID: <aPZB33C-C1t1z7Dk@shredder>
Date: Mon, 20 Oct 2025 17:06:23 +0300
From: Ido Schimmel <idosch@...sch.org>
To: g.goller@...xmox.com
Cc: davem@...emloft.net, dsahern@...nel.org, edumazet@...gle.com,
	kuba@...nel.org, pabeni@...hat.com, horms@...nel.org,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Wrong source address selection in arp_solicit for forwarded
 packets

On Fri, Oct 17, 2025 at 04:47:27PM +0200, Gabriel Goller wrote:
> Hi,
> I have a question about the arp solicit behavior:
> 
> I have the following simple infrastructure with linux hosts where the ip
> addresses are configured on dummy interfaces and all other interfaces are
> unnumbered:
> 
>   ┌────────┐     ┌────────┐     ┌────────┐    │ node1  ├─────┤ node2
> ├─────┤ node3  │    │10.0.1.1│     │10.0.1.2│     │10.0.1.3│    └────────┘
> └────────┘     └────────┘

The diagram looks mangled. At least I don't understand it.

> 
> All nodes have routes configured and can ping each other. ipv4 forwarding is
> enabled on all nodes, so pinging from node1 to node3 should work. However, I'm
> encountering an issue where node2 does not send correct arp solicitation
> packets when forwarding icmp packets from node1 to node3.

I believe ICMP is irrelevant here.

> 
> For example, when pinging from node1 to node3, node2 sends out the
> following arp packet:
> 
> 13:57:43.198959 bc:24:11:a4:f6:cd > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100),
> length 46: vlan 300, p 0, ethertype ARP (0x0806), Ethernet (len 6),
> IPv4 (len 4), Request who-has 10.0.1.3 tell 172.16.0.102, length 28
> 
> Here, 172.16.0.102 is an ip address configured on a different interface on
> node2. This request will never receive a response because `rp_filter=2`.
> 
> node2 has the following (correct) routes installed:
> 
> 10.0.1.3 nhid 18 via 10.0.1.3 dev ens22 proto openfabric src 10.0.1.2 metric 20 onlink
> 
> Since arp_announce is set to 0 (the default), arp_solicit selects the first
> interface with an ip address (inet_select_addr), which results in
> selecting the wrong source address (172.16.0.102) for the arp request.
> Because rp_filter is set to 2, we won't receive an answer to this arp
> packet, and the ping will fail unless we explicitly ping from node2 to
> node3.
> 
> I'm wondering if it would be possible (and correct) to modify arp_solicit to
> perform a fib lookup to check if there's a route with an explicit source
> address (e.g., the route above using src 10.0.1.2) and use that address as the
> source address for the arp packet. Of course, this wouldn't be backward
> compatible, as some users might rely on the current interface ordering behavior
> (or the loopback interface being selected first), so it would need to be
> controlled via a sysctl configuration flag. Perhaps I'm missing something
> obvious here though.

This would probably entail adding a new arp_announce level, but nobody
has added a new level in at least 20 years, so you will need to explain
why your setup is special and why the same functionality cannot be
achieved in a different way that does not require kernel changes.
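
For reference, the behaviour described in the quoted mail can be
inspected on node2 with standard tools. This is only a sketch: ens22,
10.0.1.3 and the sysctl names come from the quoted mail and the kernel
documentation, and the capture filter is an assumption about how the
VLAN-tagged frames appear on that interface.

```shell
# Effective ARP source-selection policy; the kernel uses the larger of
# the "all" and per-interface values.
sysctl net.ipv4.conf.all.arp_announce net.ipv4.conf.ens22.arp_announce

# Route and preferred source the FIB reports for the target
# (expected to show "src 10.0.1.2" for the route quoted above).
ip route get 10.0.1.3

# IPv4 addresses configured on the host, per interface.
ip -4 -brief addr show

# Watch the ARP requests leaving ens22, tagged or untagged, and check
# the "tell" address (needs root).
tcpdump -eni ens22 'arp or (vlan and arp)'
```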

A few things you can consider:

1. You wrote that the router interfaces are unnumbered. Modern
unnumbered networks usually assign IPv6 link-local addresses to these
interfaces. These addresses are only used for neighbour resolution and
can be used as the nexthop address for IPv4 routes. For example:

ip route add 192.0.2.1/32 nexthop via inet6 fe80::1 dev dummy1

Or using nexthop objects:

ip nexthop add id 1 via fe80::1 dev dummy1
ip route add 192.0.2.1/32 nhid 1

2. If you have interfaces whose addresses should not be considered as
source addresses when generating IP/ARP packets out of other interfaces,
then you can try placing them in a different VRF if it's viable.

3. This requires some work and I haven't looked into it closely, but I
believe it should be possible to derive the preferred source address and
rewrite it in ARP packets using tc-bpf on egress. See:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dab4e1f06cabb6834de14264394ccab197007302
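
As a rough illustration of option 2, moving the interface that carries
the unwanted address into its own VRF could look like the sketch below.
The names are hypothetical: "eth0" stands for whichever interface holds
172.16.0.102, and "vrf-mgmt" and table 100 are arbitrary choices. None
of them come from the original report.

```shell
# Create a VRF device bound to its own routing table and bring it up.
ip link add vrf-mgmt type vrf table 100
ip link set dev vrf-mgmt up

# Enslave the interface holding the unwanted address (hypothetical
# name). Its connected routes move to table 100.
ip link set dev eth0 master vrf-mgmt

# List the IPv4 addresses that now live inside the VRF.
ip -4 addr show vrf vrf-mgmt
```

Note that this also changes how traffic on the enslaved interface is
routed, so anything reachable through it needs routes in the VRF table,
and local services using it may need to be bound to the VRF.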
