Date:   Fri, 11 Jan 2019 15:29:54 -0800
From:   Stephen Hemminger <stephen@...workplumber.org>
To:     netdev@...r.kernel.org
Subject: Fw: [Bug 202235] New: regression: physical to VETH (LXC) network
 bridge after updating to 4.20.0

This looks like it is related to some of the recent discussion in netdev
around skb->tstamp (fq) or neighbor cache

Begin forwarded message:

Date: Fri, 11 Jan 2019 22:58:45 +0000
From: bugzilla-daemon@...zilla.kernel.org
To: stephen@...workplumber.org
Subject: [Bug 202235] New: regression: physical to VETH (LXC) network bridge after updating to 4.20.0


https://bugzilla.kernel.org/show_bug.cgi?id=202235

            Bug ID: 202235
           Summary: regression: physical to VETH (LXC) network bridge
                    after updating to 4.20.0
           Product: Networking
           Version: 2.5
    Kernel Version: 4.20.0
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: Other
          Assignee: stephen@...workplumber.org
          Reporter: mjevans1983@...il.com
        Regression: No

I have since reverted to the working LTS kernel image offered by Arch Linux
(4.19.13), but am willing to re-test / gather additional data during a couple
of lower-use time periods during the week.

After updating to Linux 4.20.0 (along with a full system update otherwise) my
BRIDGED network connections to some LXC containers ceased working.  

Attempting to troubleshoot this issue also produced extremely odd results,
which I think offhand MIGHT have caused network packets to fill up some kind of
memory buffer instead of being relayed or dropped; there are some additional
details in the serverfault question and LXC bug that I filed, as it was
initially (and still is) unclear where the actual issue is.

At this time I am unsure if it is related to netdev (bridge, veth), cgroups, or
some changed default that now needs to be configured differently from the
previous defaults.

https://serverfault.com/questions/947848/linux-bridge-broken-after-upgrade-out-of-ideas-places-to-look-now-4-20-0-arc 

https://github.com/lxc/lxc/issues/2769  

* It is NOT related to IP forwarding, as this is a BRIDGED connection, not a
routed one, and it works on older kernels without that enabled.  

* physical network to bridge works (and will stay connected for a few minutes
after later troubleshooting steps, even if ARP caches / ping flake out and stop
responding)

* VETH (within LXC) can ping the host IP on the bridge if manually assigned a
static address, but not the gateway (which the host can reach before this
step).  Doing this seems to cause general instability and a timed-out SSH
session.  This led me to reboot between each round of testing to ensure I had
a clean slate to start with.

I covered the major settings that I checked in the other two reports, but I'm
open to checking other values and/or performing different kinds of tests
occasionally over a given week.  Responses won't be immediate, but I'll try to
check on this frequently over the next two weeks.
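
For anyone gathering comparable data, the following is a minimal sketch (not
part of the original report) of how bridge-related kernel settings could be
snapshotted on 4.19 and 4.20 and then compared.  The list of settings is an
assumption chosen for illustration; the report does not say which values were
checked.  The net/bridge entries only exist when the br_netfilter module is
loaded, so a missing file is recorded as None rather than treated as an error.

```python
from pathlib import Path

# Illustrative selection only; the original report does not list these.
SYSCTLS = [
    "net/ipv4/ip_forward",
    "net/bridge/bridge-nf-call-iptables",
    "net/bridge/bridge-nf-call-ip6tables",
    "net/bridge/bridge-nf-call-arptables",
]


def read_sysctls(names, root="/proc/sys"):
    """Return {name: value}; value is None if the file is absent or unreadable."""
    result = {}
    for name in names:
        try:
            result[name] = (Path(root) / name).read_text().strip()
        except OSError:
            result[name] = None
    return result


def diff_snapshots(before, after):
    """Return {name: (before, after)} for every setting whose value differs."""
    keys = sorted(set(before) | set(after))
    return {
        k: (before.get(k), after.get(k))
        for k in keys
        if before.get(k) != after.get(k)
    }


if __name__ == "__main__":
    for name, value in read_sysctls(SYSCTLS).items():
        shown = value if value is not None else "(not present)"
        print(f"{name} = {shown}")
```

Running the script once under each kernel and passing the two resulting
dictionaries to diff_snapshots would show whether any of these defaults
changed between the working and the failing boot.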

-- 
You are receiving this mail because:
You are the assignee for the bug.
