Message-ID: <20201210130007.GX22874@principal.rfc2324.org>
Date:   Thu, 10 Dec 2020 14:00:07 +0100
From:   Maximilian Wilhelm <max@...2324.org>
To:     netdev@...r.kernel.org
Subject: Regression in igb / bonding / VLAN between 5.8.10 and 5.9.6?

Dear netdev people,

I updated one of my APU2 boxes yesterday and was confronted with an
interesting problem: With (Debian) kernel 5.9.6, VLAN interfaces on top
of a bond on top of two I210 NICs work only one way (outbound)
unless the VLAN interface is in promisc mode.

The setup looks like this:

        enp1s0       enp2s0
           \           /
            \         /
             \       /
               bond0 (LACP L3+4)
             /       \
            /         \
           /           \
        vlan23       vlan42
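
For anyone who wants to reproduce the topology, it could be rebuilt with
iproute2 roughly as sketched below. This is not the configuration actually
used on the box: the VLAN IDs 23 and 42 are guessed from the interface
names, and the helper name build_commands is made up for illustration.
The script only prints the commands, it does not run them.

```python
def build_commands(slaves=("enp1s0", "enp2s0"), bond="bond0", vlan_ids=(23, 42)):
    """Return iproute2 commands for an LACP bond with VLAN interfaces on top.

    All defaults are assumptions taken from the diagram, not a dump of the
    real configuration.
    """
    cmds = [
        # 802.3ad is the LACP bonding mode; layer3+4 is the "L3+4" hash policy.
        f"ip link add {bond} type bond mode 802.3ad xmit_hash_policy layer3+4",
    ]
    for nic in slaves:
        # Taking the NIC down before enslaving it is the conservative order.
        cmds.append(f"ip link set {nic} down")
        cmds.append(f"ip link set {nic} master {bond}")
    cmds.append(f"ip link set {bond} up")
    for vid in vlan_ids:
        # One 802.1Q interface per VLAN ID, stacked on the bond.
        cmds.append(f"ip link add link {bond} name vlan{vid} type vlan id {vid}")
        cmds.append(f"ip link set vlan{vid} up")
    return cmds


if __name__ == "__main__":
    print("\n".join(build_commands()))
```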

Traffic leaving the box (ARP, ND, OSPF Hellos, ...) works fine
according to tcpdump on a connected device, but inbound traffic only
seems to reach the system when vlanXX is in promisc mode.  If I run
tcpdump on vlanXX with --no-promiscuous-mode, I can confirm that there
are only outbound packets and none of the ARP replies etc. sent by the
remote box.  On bond0 as well as on the physical NICs I see the same
behaviour (+ LACP frames on the NICs).
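
Since the behaviour depends on promisc mode, it may help to check which
interfaces currently have the flag set while testing. A minimal sketch,
assuming the hex value in /sys/class/net/<iface>/flags reflects the
promiscuous state on this kernel (not verified here); the function names
are made up for illustration.

```python
import sys

IFF_PROMISC = 0x100  # interface flag bit, as defined in <linux/if.h>


def is_promisc(flags_hex):
    """Return True if the hex flags string (e.g. "0x1103") has IFF_PROMISC set."""
    return bool(int(flags_hex.strip(), 16) & IFF_PROMISC)


def read_flags(iface):
    """Read the raw flags string for one interface from sysfs."""
    with open(f"/sys/class/net/{iface}/flags") as fh:
        return fh.read()


if __name__ == "__main__":
    # Usage (hypothetical file name): python3 promisc.py bond0 vlan23 vlan42
    for iface in sys.argv[1:]:
        state = "promisc" if is_promisc(read_flags(iface)) else "not promisc"
        print(iface, state)
```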

I did some tests to pinpoint the problem:
 * VLAN interfaces on top of the physical NIC work fine
 * LACP seems to work fine, slow/fast don't make a difference
 * Disabling all offloading I could disable didn't make it work
   (especially rxvlan)
 * With (Debian) kernel 5.8.10 it works
 * /proc/net/dev shows no RX errors or drops at all, only one TX drop
   on the VLAN interfaces
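
The /proc/net/dev check from the list above can be repeated with a small
script like the following. It is only a sketch: it assumes the usual
layout of two header lines followed by 16 counters per interface, and the
function names are made up for illustration.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")


def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {iface: {"rx": {...}, "tx": {...}}}."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        values = [int(v) for v in data.split()]
        if len(values) != len(RX_FIELDS) + len(TX_FIELDS):
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:])),
        }
    return stats


def nonzero_problems(stats):
    """Return only the non-zero error and drop counters per interface."""
    result = {}
    for iface, counters in stats.items():
        bad = {
            f"{direction}_{field}": counters[direction][field]
            for direction in ("rx", "tx")
            for field in ("errs", "drop")
            if counters[direction][field]
        }
        if bad:
            result[iface] = bad
    return result


if __name__ == "__main__":
    with open("/proc/net/dev") as fh:
        print(nonzero_problems(parse_net_dev(fh.read())))
```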

I couldn't find anything suspicious on the box, either in the logs or
in  ip -d link  etc. Is this a known bug? If not, should I test anything
specific or maybe do a git bisect between the kernel versions?

Kind regards
Max
-- 
"Does it bother me, that people hurt others, because they are too weak to face the truth? Yeah. Sorry 'bout that."
 -- Thirteen, House M.D.
