Message-ID: <ZzuuSDQZux8uof5O@Z926fQmE5jqhFMgp6>
Date: Mon, 18 Nov 2024 22:14:48 +0100
From: Etienne Buira <etienne.buira@...e.fr>
To: davem@...emloft.net, dsahern@...nel.org, edumazet@...gle.com,
	kuba@...nel.org, pabeni@...hat.com, horms@...nel.org,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [Bug] Linux sends poisonous ARP replies on ethernet

Hi all,

I found problematic behaviours in the Linux network stack. They look so
closely related that I am filing a single bug report.

The underlying bug(s) might be ancient (I have previously seen strange
behaviours that this could explain), so fixes should probably find
their way to stable@...r.kernel.org.

The configuration:
Two boxes sit on a dedicated virtual hub. Both run
linux-torvalds b5a24181e461e8bfa8cdf35e1804679dc1bebcdd (untainted,
configured with the attached linux.config file) as KVM guests under QEMU
(Gentoo's version 8.2.3), using virtio NICs.

box1:
  # ip a
	1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
	    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
	    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
	       valid_lft forever preferred_lft forever
	2: nic0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
	    link/ether 00:16:3e:00:00:04 brd ff:ff:ff:ff:ff:ff
	    inet 10.0.1.1/24 brd 10.0.1.255 scope global nic0
	       valid_lft forever preferred_lft forever
	3: nic1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
	    link/ether 00:16:3e:00:00:01 brd ff:ff:ff:ff:ff:ff
	    inet 10.0.0.1/24 brd 10.0.0.255 scope global nic1
	       valid_lft forever preferred_lft forever
	4: nic2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
	    link/ether 00:16:3e:00:00:02 brd ff:ff:ff:ff:ff:ff
	    inet 10.0.0.2/24 brd 10.0.0.255 scope global nic2
	       valid_lft forever preferred_lft forever

box2:
  # ip a
	1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
	    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
	    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
	       valid_lft forever preferred_lft forever
	2: nic3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
	    link/ether 00:16:3e:00:00:03 brd ff:ff:ff:ff:ff:ff
	    inet 10.0.0.3/24 brd 10.0.0.255 scope global nic3
	       valid_lft forever preferred_lft forever



The problem I found is that Linux replies to ARP requests that are not
directed at it, which does not comply with RFC 826:
box2# arping -c 5 10.0.0.2 -I nic3
	ARPING 10.0.0.2 from 10.0.0.3 nic3
	Unicast reply from 10.0.0.2 [00:16:3e:00:00:04] 2.285ms
	Unicast reply from 10.0.0.2 [00:16:3e:00:00:01] 2.357ms
	Unicast reply from 10.0.0.2 [00:16:3e:00:00:02] 2.368ms
	Unicast reply from 10.0.0.2 [00:16:3e:00:00:02] 0.598ms
	Unicast reply from 10.0.0.2 [00:16:3e:00:00:02] 0.771ms
	Unicast reply from 10.0.0.2 [00:16:3e:00:00:02] 0.600ms
	Unicast reply from 10.0.0.2 [00:16:3e:00:00:02] 0.627ms
	Sent 5 probe(s) (0 broadcast(s))
	Received 7 response(s) (0 request(s), 0 broadcast(s))

Here, all of box1's NICs replied with their own Ethernet address(!),
including 00:16:3e:00:00:04, which is not even on the same IP network.
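
To make the mismatch explicit, here is a minimal Python sketch that
cross-checks the arping replies against box1's address table. The
addresses and reply lines are copied from the outputs above (repeated
replies from 00:16:3e:00:00:02 are trimmed to two); the names
BOX1_NICS, responders and classify are hypothetical helpers made up
for this illustration, not part of any tool.

```python
import ipaddress
import re

# MAC and IPv4 address of each box1 NIC, copied from "ip a" above.
BOX1_NICS = {
    "nic0": ("00:16:3e:00:00:04", "10.0.1.1/24"),
    "nic1": ("00:16:3e:00:00:01", "10.0.0.1/24"),
    "nic2": ("00:16:3e:00:00:02", "10.0.0.2/24"),
}

# Reply lines copied from the arping run on box2 (duplicates trimmed).
ARPING_OUTPUT = """\
Unicast reply from 10.0.0.2 [00:16:3e:00:00:04] 2.285ms
Unicast reply from 10.0.0.2 [00:16:3e:00:00:01] 2.357ms
Unicast reply from 10.0.0.2 [00:16:3e:00:00:02] 2.368ms
Unicast reply from 10.0.0.2 [00:16:3e:00:00:02] 0.598ms
"""

REPLY_RE = re.compile(r"reply from (\S+) \[([0-9a-fA-F:]{17})\]")


def responders(output):
    """Map each answered IP to the distinct MACs that claimed it, in order."""
    seen = {}
    for ip, mac in REPLY_RE.findall(output):
        macs = seen.setdefault(ip, [])
        if mac.lower() not in macs:
            macs.append(mac.lower())
    return seen


def classify(target, requester):
    """For every MAC that answered for `target`, report which NIC it is,
    whether that NIC owns `target`, and whether `requester` is on its subnet."""
    rows = []
    for mac in responders(ARPING_OUTPUT).get(target, []):
        name = next(n for n, (m, _) in BOX1_NICS.items() if m == mac)
        iface = ipaddress.ip_interface(BOX1_NICS[name][1])
        owns_target = iface.ip == ipaddress.ip_address(target)
        same_subnet = ipaddress.ip_address(requester) in iface.network
        rows.append((name, mac, owns_target, same_subnet))
    return rows


for row in classify("10.0.0.2", "10.0.0.3"):
    print(row)
```

Only the row for nic2 has owns_target set, and the row for nic0 has
neither flag set, which is the case pointed out above.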

Using a breakpoint on arp_send_dst, I could confirm that an ARP reply is
sent for each NIC:
	Thread 2 hit Breakpoint 6.1, arp_send_dst (type=2, ptype=2054, dest_ip=50331658, dev=0xffff888003350000, src_ip=33554442, dest_hw=0xffff88800373e456 "", src_hw=0xffff88800306c3a8 "", target_hw=0xffff88800373e456 "", dst=0x0 <fixed_percpu_data>)
	    at /ssdtmp/linux/linux-torvalds/net/ipv4/arp.c:307
	307     {

	(gdb) bt

	#0  arp_send_dst (type=2, ptype=2054, dest_ip=50331658, dev=0xffff888003350000, src_ip=33554442, dest_hw=0xffff88800373e456 "", src_hw=0xffff88800306c3a8 "", target_hw=0xffff88800373e456 "", dst=0x0 <fixed_percpu_data>) at /ssdtmp/linux/linux-torvalds/net/ipv4/arp.c:307
	#1  0xffffffff816104b9 in arp_process (net=0xffffffff82347940 <init_net>, sk=<optimized out>, skb=0xffff888003eac900) at /ssdtmp/linux/linux-torvalds/net/ipv4/arp.c:852
	#2  0xffffffff81541a21 in __netif_receive_skb_list_ptype (orig_dev=0xffff888003350000, pt_prev=0xffffffff81ede240 <arp_packet_type>, head=0xffffc900000d0cb8) at /ssdtmp/linux/linux-torvalds/net/core/dev.c:5718
	#3  __netif_receive_skb_list_core (head=head@...ry=0xffff888003378910, pfmemalloc=pfmemalloc@...ry=false) at /ssdtmp/linux/linux-torvalds/net/core/dev.c:5760
	#4  0xffffffff81542088 in __netif_receive_skb_list (head=0xffff888003378910) at /ssdtmp/linux/linux-torvalds/net/core/dev.c:5812
	#5  netif_receive_skb_list_internal (head=head@...ry=0xffff888003378910) at /ssdtmp/linux/linux-torvalds/net/core/dev.c:5903
	#6  0xffffffff815427c9 in gro_normal_list (napi=0xffff888003378808) at /ssdtmp/linux/linux-torvalds/include/net/gro.h:515
	#7  napi_complete_done (n=n@...ry=0xffff888003378808, work_done=work_done@...ry=1) at /ssdtmp/linux/linux-torvalds/net/core/dev.c:6254
	#8  0xffffffff814c7e44 in virtqueue_napi_complete (processed=1, vq=0xffff888003059e00, napi=0xffff888003378808) at /ssdtmp/linux/linux-torvalds/drivers/net/virtio_net.c:717
	#9  virtnet_poll (napi=0xffff888003378808, budget=<optimized out>) at /ssdtmp/linux/linux-torvalds/drivers/net/virtio_net.c:2851
	#10 0xffffffff815429e6 in __napi_poll (n=n@...ry=0xffff888003378808, repoll=repoll@...ry=0xffffc900000d0ebf) at /ssdtmp/linux/linux-torvalds/net/core/dev.c:6779
	#11 0xffffffff81542f7c in napi_poll (repoll=0xffffc900000d0ed8, n=0xffff888003378808) at /ssdtmp/linux/linux-torvalds/net/core/dev.c:6848
	#12 net_rx_action () at /ssdtmp/linux/linux-torvalds/net/core/dev.c:6970
	#13 0xffffffff8106a113 in handle_softirqs (ksirqd=ksirqd@...ry=false) at /ssdtmp/linux/linux-torvalds/kernel/softirq.c:554
	#14 0xffffffff8106a80a in __do_softirq () at /ssdtmp/linux/linux-torvalds/kernel/softirq.c:588
	#15 invoke_softirq () at /ssdtmp/linux/linux-torvalds/kernel/softirq.c:428
	#16 __irq_exit_rcu () at /ssdtmp/linux/linux-torvalds/kernel/softirq.c:637
	#17 irq_exit_rcu () at /ssdtmp/linux/linux-torvalds/kernel/softirq.c:649
	#18 0xffffffff8167e979 in common_interrupt (regs=0xffffc9000009be38, error_code=<optimized out>) at /ssdtmp/linux/linux-torvalds/arch/x86/kernel/irq.c:278
	Backtrace stopped: Cannot access memory at address 0xffffc900000d1010
	(gdb) p dev->name

	$62 = "nic2", '\000' <repeats 11 times>
	(gdb) c

	Continuing.
	[Switching to Thread 1.1]

	Thread 1 hit Breakpoint 6.1, arp_send_dst (type=2, ptype=2054, dest_ip=50331658, dev=0xffff88800334b000, src_ip=33554442, dest_hw=0xffff888004c22c56 "", src_hw=0xffff888002b88fa8 "", target_hw=0xffff888004c22c56 "", dst=0x0 <fixed_percpu_data>)
	    at /ssdtmp/linux/linux-torvalds/net/ipv4/arp.c:307
	307     {

	(gdb) p dev->name

	$63 = "nic1", '\000' <repeats 11 times>
	(gdb) c

	Continuing.
	[Switching to Thread 1.2]

	Thread 2 hit Breakpoint 6.1, arp_send_dst (type=2, ptype=2054, dest_ip=50331658, dev=0xffff888002876000, src_ip=33554442, dest_hw=0xffff88800373e056 "", src_hw=0xffff888002b88ba8 "", target_hw=0xffff88800373e056 "", dst=0x0 <fixed_percpu_data>)
	    at /ssdtmp/linux/linux-torvalds/net/ipv4/arp.c:307
	307     {

	(gdb) p dev->name

	$64 = "nic0", '\000' <repeats 11 times>
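
For readers decoding the integer arguments in the breakpoint output
above: gdb prints the IPv4 addresses as plain integers, although they
are held in network byte order. A minimal sketch of the conversion,
assuming a little-endian guest (the backtrace shows arch/x86); the
helper name be32_to_ip is hypothetical, and the two constants are the
standard ARP values (EtherType 0x0806, opcode 2 for a reply).

```python
import socket
import struct

ETH_P_ARP = 0x0806  # EtherType for ARP
ARPOP_REPLY = 2     # ARP opcode: reply


def be32_to_ip(value):
    """Convert a network-order IPv4 address that a little-endian
    debugger printed as an integer into dotted-quad notation."""
    return socket.inet_ntoa(struct.pack("<I", value))


# Values copied from the arp_send_dst breakpoint hits above.
print("type     ", 2, "ARPOP_REPLY" if 2 == ARPOP_REPLY else "other")
print("ptype    ", hex(2054), "ETH_P_ARP" if 2054 == ETH_P_ARP else "other")
print("dest_ip  ", be32_to_ip(50331658))
print("src_ip   ", be32_to_ip(33554442))
```

Under that assumption, dest_ip decodes to 10.0.0.3 and src_ip to
10.0.0.2 in all three hits, so nic2, nic1 and nic0 each send an ARP
reply for 10.0.0.2 to box2.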


The network stack also seems to be at fault in that I can establish a TCP
connection to 10.0.0.2 (nic2's IP) on a listening port with Ethernet frames
whose destination/source address is nic1's instead of nic2's.


Unfortunately, I cannot afford to get familiar enough with Linux's
network stack to fix it without sponsorship.

Best wishes.


View attachment "linux.config" of type "text/plain" (88249 bytes)
