Date:	Sun, 7 Jun 2015 20:56:53 -0700
From:	Joe Stringer <joestringer@...ira.com>
To:	Zack Weinberg <zackw@...ix.com>
Cc:	Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: "ip netns create" hangs forever, spamming console with
 "unregister_netdevice: waiting for lo to become free"

On 26 May 2015 at 11:56, Zack Weinberg <zackw@...ix.com> wrote:
> On Tue, May 26, 2015 at 12:21 PM, Zack Weinberg <zackw@...ix.com> wrote:
>> I have an application that makes heavy use of network namespaces,
>> creating and destroying them on the fly during operation.  With 100%
>> reproducibility, the first invocation of "ip netns create" after any
>> "ip netns del" hangs forever in D-state; only rebooting the machine
>> clears the condition.
>
> Following up to myself to say that reproduction is not as simple as
> 'ip netns add test; ip netns del test; ip netns add test2'.  In fact,
> not even bringing the namespace (and all associated interfaces) up and
> then down again _exactly_ as my production code does it will trigger
> the bug.  It appears to be necessary to push a significant amount of
> data through interfaces attached to the namespace.
>
> Since I had to reset the machine to attempt to create a repro recipe,
> I can no longer perform diagnostics on the hung processes, but I've
> restarted the application and it should reach the problem state again
> in a day or two.

Hi Zack, have you had any further development on this issue?

We've been running into a similar issue during OVS development. In a
fairly complex environment, running tests that use OVS with tunnels and
conntrack inside Docker containers reliably produces very similar
lockdep reports. At face value, it looks as if something grabs the net
mutex and never releases it. The problem only arises after sending
traffic between containers. I'm trying to narrow down this test case
to get better information on when and how it fails.
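
For reference, here is a minimal sketch of the kind of sequence the
quoted report describes: create a namespace, push traffic through an
interface attached to it, delete it, then create another namespace.
This is not a confirmed reproducer. The namespace name, the veth
interface names, the 10.200.0.0/24 addresses, and the use of flood
ping as the traffic source are all assumptions made for illustration,
and the real workloads (tunnels, conntrack, containers) are
considerably more involved. The script only prints the commands
unless DRY_RUN=0 is set, in which case it must run as root.

```shell
#!/bin/sh
# Sketch only: names, addresses, and the traffic generator are assumptions.
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
NS="${NS:-repro-test}"      # hypothetical namespace name

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_once() {
    # Create the namespace and attach one end of a veth pair to it.
    run ip netns add "$NS"
    run ip link add veth-host type veth peer name veth-ns
    run ip link set veth-ns netns "$NS"

    # Bring both ends up with addresses on a private subnet.
    run ip addr add 10.200.0.1/24 dev veth-host
    run ip link set veth-host up
    run ip netns exec "$NS" ip addr add 10.200.0.2/24 dev veth-ns
    run ip netns exec "$NS" ip link set veth-ns up
    run ip netns exec "$NS" ip link set lo up

    # Push a significant amount of traffic into the namespace.
    run ping -q -f -c 100000 -s 1400 10.200.0.2

    # Tear the namespace down; this also destroys the veth pair.
    run ip netns del "$NS"

    # In the reported failure, the next create is the call that hangs.
    run ip netns add "${NS}2"
    run ip netns del "${NS}2"
}

repro_once
```

If the second "ip netns add" blocks, the console message from the
subject line and the state of the hung process are the things worth
capturing before a reboot.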