Date: Tue, 23 May 2023 16:01:38 -0700
From: Jay Vosburgh <jay.vosburgh@...onical.com>
To: Moviuro <moviuro@...il.com>
cc: netdev@...r.kernel.org
Subject: Re: Secondary bond slave receiving packets when preferred is up

Moviuro <moviuro@...il.com> wrote:

>Hi there,
>
>On 2 similar machines, some (random?) packets are received on a wireless
>bond slave when the preferred eth interface is connected: this causes
>local packet loss and at worst, disconnects (e.g. SSH and KDEConnect).
>
>My setup looks fine, inspired by the Arch wiki[0], see
>/proc/net/bonding/bond0 below. The archlinux community has not been able
>to help so far[1].
>
>     +-----------+                
>     |Router .1  |                
>     +-----+-----+                
>           |                      
>     +-----+-----+                
>     |Switch .30 +---------------+
>     +--+--------+-------------+ |
>        |                      | |
>        |                      | |
> +------+--+    +-----------+  | |
> | WAP .21 +~~~~+Client .111+--+ |
> +------+--+    +-----------+    |
>        |                        |
>        |       +-----------+    |
>        +~~~~~~~+Client .149-----+
>                +-----------+     
>
>Running ping(8) for a few hours, there's nothing much going on, packet
>loss is really because ICMP packets end up on the WiFi interface:
>
>* .1 -> .149: 56436 sent, 56405 replies
>* .1 -> .111: 20643 sent, 20640 replies
>* .111 -> .149: 7682/7702 packets
>* .149 -> .111: 14791/14792 packets
>
>Sure enough, there's some noise on the WiFi interface:
>
>root@149 # tcpdump -ttttnei wlp3s0 host 192.168.1.149 and not arp
>2023-05-23 09:29:46.771535 11:11:11:11:11:74 > BB:BB:BB:BB:BB:33, ethertype IPv4 (0x0800), length 98: 192.168.1.1 > 192.168.1.149: ICMP echo request, id 64306, seq 53425, length 64
>2023-05-23 09:36:04.710859 bb:bb:bb:bb:bb:32 > BB:BB:BB:BB:BB:33, ethertype IPv4 (0x0800), length 98: 192.168.1.111 > 192.168.1.149: ICMP echo request, id 1, seq 2390, length 64

	Some amount of random traffic arriving on the inactive interface
of an active-backup bond is expected; switches send traffic to such
places for various reasons.  My initial guess would be that the switch's
forwarding entry for whatever BB:BB:BB:BB:BB:33 is expired, and the
switch flooded traffic for that destination to all ports.  As an aside,
what is that MAC address?  The last octet (33) doesn't appear in any of
the bond info dumps you list later for the .149 host.
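
	To illustrate the flooding guess above, here is a toy model of a
learning switch, not a description of any particular switch.  The class
name, the port names and the 300 second aging time are assumptions made
for the example; the MAC addresses are the anonymized ones from the
tcpdump output.

```python
class LearningSwitch:
    """Toy model of a learning switch with an aging forwarding table."""

    def __init__(self, ports, aging_seconds=300.0):
        self.ports = list(ports)
        self.aging_seconds = aging_seconds  # assumed aging time
        self.table = {}  # MAC -> (port, time last seen as a source)

    def forward(self, src_mac, dst_mac, in_port, now):
        """Return the list of ports a frame is sent out of."""
        # Learn (or refresh) the source address on the ingress port.
        self.table[src_mac.lower()] = (in_port, now)

        entry = self.table.get(dst_mac.lower())
        if entry is not None and now - entry[1] <= self.aging_seconds:
            port = entry[0]
            # Known destination: one egress port (or nothing if the
            # destination lives on the port the frame arrived on).
            return [] if port == in_port else [port]

        # Unknown or expired destination: flood to every other port.
        return [p for p in self.ports if p != in_port]


# Hypothetical ports: uplink to the router, wired link to .149, the WAP.
sw = LearningSwitch(["router", "eth149", "wap"])
sw.forward("bb:bb:bb:bb:bb:33", "11:11:11:11:11:74", "eth149", now=0.0)
fresh = sw.forward("11:11:11:11:11:74", "BB:BB:BB:BB:BB:33", "router", now=10.0)
stale = sw.forward("11:11:11:11:11:74", "BB:BB:BB:BB:BB:33", "router", now=400.0)
print(fresh)  # entry still fresh: wired port only
print(stale)  # entry aged out: flooded, including toward the WAP
```

	In this model, a destination that has not transmitted anything
within the aging time is flooded, which is one way a frame could end up
on the wireless side even though the wired link is up.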

	In any event, an inactive bond interface will pass incoming
traffic in two cases:

	1) its destination MAC address is in the link local reserved
range, 01:80:c2:00:00:0?, which is used for things like Spanning Tree or
LACP; the complete list can be found at

https://standards.ieee.org/products-programs/regauth/grpmac/public/

	These should not be ARP or IP, and this is unlikely to be your
situation.

	2) Something is bound directly to the bond interface itself via
a raw socket or the like; an example of this is LLDP, which needs to
exchange protocol frames at the interface level.
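
	For case 1, a small sketch of how a captured destination MAC
could be checked against the 01:80:c2:00:00:0? range.  The function name
is made up for the example; it only does the address comparison and says
nothing about what the bonding driver does internally.

```python
def is_link_local_reserved(mac: str) -> bool:
    """True if mac falls in 01:80:c2:00:00:00 through 01:80:c2:00:00:0f."""
    octets = [int(part, 16) for part in mac.replace("-", ":").split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    # First five octets fixed, last octet in the range 0x00..0x0f.
    return octets[:5] == [0x01, 0x80, 0xC2, 0x00, 0x00] and octets[5] <= 0x0F


print(is_link_local_reserved("01:80:C2:00:00:00"))  # Spanning Tree address
print(is_link_local_reserved("01:80:c2:00:00:02"))  # LACP address
print(is_link_local_reserved("BB:BB:BB:BB:BB:33"))  # destination from the capture
```

	The destination in the tcpdump output above is an ordinary
unicast address, so it does not fall in this range.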

	Even if the bond accepted some IP traffic on the inactive
interface and sent it up the stack, any reply should go back out the
active interface.  This is based on the lack of failovers in the bond
status output, and presuming that the routing tables on .111 and .149 are
what I'd expect (basically, a default route and subnet route for
192.168.1.0/24 that go through the bond only).

	Some suggestions that might help:

	1) Check rp_filter; if it's not enabled, then turn it on in
strict mode.  This means ensuring that the sysctls for .all, the bond
and its interfaces are all set to 1, e.g.,

net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.bond0.rp_filter = 1
net.ipv4.conf.wlp5s0.rp_filter = 1
[... and so on ...]

	Setting any of them to 2 will enable loose mode (the maximum
value between .all and the interface is what counts).  Loose mode, or
rp_filter being off entirely, might be your problem if your routing is
not simple (e.g., you've got other IP networks that you didn't
describe).  The docs for this can be found at

https://docs.kernel.org/networking/ip-sysctl.html

	2) Enable the bonding option fail_over_mac = follow; this will
cause the MACs of the bond's interfaces to not all be set to the same MAC.
If somehow the switch is getting confused by seeing the same MAC from
multiple ports, this may help.
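
	For suggestion 1, a minimal sketch of how the effective
rp_filter mode per interface could be worked out, using the rule that
the maximum of .all and the interface value is what counts.  The helper
names are made up for the example, and it assumes the usual
/proc/sys/net/ipv4/conf/<interface>/rp_filter layout.

```python
from pathlib import Path

# rp_filter values as described in ip-sysctl: 0 off, 1 strict, 2 loose.
MODES = {0: "off", 1: "strict", 2: "loose"}

PROC_CONF = Path("/proc/sys/net/ipv4/conf")


def effective_rp_filter(all_value: int, iface_value: int) -> int:
    """The maximum of conf.all and conf.<iface> is the value that applies."""
    return max(all_value, iface_value)


def read_rp_filter(iface: str, root: Path = PROC_CONF) -> int:
    """Read the raw rp_filter setting for one entry under conf/."""
    return int((Path(root) / iface / "rp_filter").read_text().strip())


def report(ifaces, root: Path = PROC_CONF) -> dict:
    """Map each interface name to its effective mode name."""
    all_value = read_rp_filter("all", root)
    return {
        iface: MODES[effective_rp_filter(all_value, read_rp_filter(iface, root))]
        for iface in ifaces
    }
```

	Calling report() with the bond and each of its interfaces
(e.g., report(["bond0", "wlp3s0"])) should show "strict" for all of them
once the sysctls above are in place; "loose" or "off" for any of them
means one of the values still needs changing.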

	-J

---
	-Jay Vosburgh, jay.vosburgh@...onical.com
