Message-ID: <CAMEtUuwKd-2YZBF8BtKFaKvgb8MgwTfJKz3KkkzVYkhJPNNXzw@mail.gmail.com>
Date:	Sat, 16 Nov 2013 18:18:35 -0800
From:	Alexei Starovoitov <ast@...mgrid.com>
To:	netdev@...r.kernel.org
Subject: unregister_netdevice: waiting for lo to become free

Hi,

Roughly once every 24 hours we hit a namespace cleanup bug:

[53432.230745] unregister_netdevice: waiting for lo to become free. Usage count = 2
[53442.456822] unregister_netdevice: waiting for lo to become free. Usage count = 2
[53452.646927] unregister_netdevice: waiting for lo to become free. Usage count = 2
[53462.861009] unregister_netdevice: waiting for lo to become free. Usage count = 2
[53468.423648] INFO: task ip:1444 blocked for more than 120 seconds.
[53468.423650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[53468.423651] ip              D ffff88082fb13280     0  1444   1443 0x00000000
[53468.423653]  ffff8806e0b19dd8 0000000000000002 ffff880754b0aee0 ffff8806e0b19fd8
[53468.423655]  ffff8806e0b19fd8 ffff8806e0b19fd8 ffff880803d8ddc0 ffff880754b0aee0
[53468.423657]  0000000000000002 ffffffff81cbe060 ffffffff81cbe064 ffff880754b0aee0
[53468.423658] Call Trace:
[53468.423663]  [<ffffffff8164a7b9>] schedule+0x29/0x70
[53468.423664]  [<ffffffff8164aace>] schedule_preempt_disabled+0xe/0x10
[53468.423666]  [<ffffffff81648b8f>] __mutex_lock_slowpath+0x11f/0x1e0
[53468.423668]  [<ffffffff8152b6f1>] ? net_alloc_generic+0x21/0x30
[53468.423670]  [<ffffffff8164851a>] mutex_lock+0x2a/0x50
[53468.423671]  [<ffffffff8152be20>] copy_net_ns+0x70/0x110
[53468.423674]  [<ffffffff81073261>] create_new_namespaces+0x101/0x1b0
[53468.423676]  [<ffffffff810734ee>] unshare_nsproxy_namespaces+0x6e/0xb0
[53468.423678]  [<ffffffff81047809>] SyS_unshare+0x189/0x2b0
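
When comparing runs, it can help to pull the repeated warning out of dmesg and look at how often it fires and with what usage count. Below is a minimal Python sketch of that; the function names are made up for illustration, and the regex only assumes the message text shown in the log above. It accepts the "Usage count" part either on the same line or wrapped onto the next one.

```python
import re

# Matches the repeated warning, whether or not "Usage count = N"
# was wrapped onto the following line.
WAIT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+unregister_netdevice: waiting for "
    r"(?P<dev>\S+) to become free\.\s+Usage count = (?P<count>\d+)"
)


def parse_waits(log_text):
    """Return (timestamp, device, usage_count) tuples found in dmesg text."""
    return [
        (float(m.group("ts")), m.group("dev"), int(m.group("count")))
        for m in WAIT_RE.finditer(log_text)
    ]


def intervals(waits):
    """Seconds between consecutive warnings, rounded to milliseconds."""
    return [round(b[0] - a[0], 3) for a, b in zip(waits, waits[1:])]


# Two entries copied from the log above, in their wrapped form.
SAMPLE = """\
[53432.230745] unregister_netdevice: waiting for lo to become free.
Usage count = 2
[53442.456822] unregister_netdevice: waiting for lo to become free.
Usage count = 2
"""

if __name__ == "__main__":
    waits = parse_waits(SAMPLE)
    for ts, dev, count in waits:
        print(ts, dev, count)
    print("intervals:", intervals(waits))
```

Feeding it the full dmesg output instead of SAMPLE shows whether the usage count ever changes while the unregister is stuck.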

It's reproducible on 3.10.xx.
It's not clear whether net-next still has it; maybe we just haven't run it
long enough.

We've tried to narrow it down over the last month, but haven't gotten far.
It can happen in any of our tests. Most of them create namespaces, veth
pairs and bridges, run iperf in the namespaces, kill them, disconnect
interfaces and so on.
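
To make the kind of cycle described above concrete, here is a rough Python sketch of one create/teardown iteration built from standard ip(8) commands. It is not the actual test suite: the namespace, veth and bridge names are invented for illustration, the iperf step is only marked by a comment, and running it for real would need root. By default it only prints the commands.

```python
import subprocess


def setup_commands(bridge="br-test"):
    """Commands to create the (hypothetical) shared bridge once."""
    return [
        ["ip", "link", "add", "name", bridge, "type", "bridge"],
        ["ip", "link", "set", bridge, "up"],
    ]


def cycle_commands(idx, bridge="br-test"):
    """Build the ip(8) argv lists for one create/teardown cycle.

    The names (ns<idx>, veth<idx>a/b, br-test) are made up for this sketch.
    """
    ns = f"ns{idx}"
    host_if, ns_if = f"veth{idx}a", f"veth{idx}b"
    return [
        ["ip", "netns", "add", ns],
        ["ip", "link", "add", host_if, "type", "veth", "peer", "name", ns_if],
        ["ip", "link", "set", ns_if, "netns", ns],
        ["ip", "link", "set", host_if, "master", bridge],
        ["ip", "link", "set", host_if, "up"],
        ["ip", "netns", "exec", ns, "ip", "link", "set", "lo", "up"],
        ["ip", "netns", "exec", ns, "ip", "link", "set", ns_if, "up"],
        # Traffic (for example iperf inside the namespace) would run here.
        ["ip", "link", "set", host_if, "nomaster"],
        ["ip", "netns", "del", ns],
    ]


def run_cycle(idx, runner=subprocess.check_call):
    """Execute one cycle; pass a different runner for a dry run."""
    for argv in cycle_commands(idx):
        runner(argv)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in setup_commands() + cycle_commands(0):
        print(" ".join(argv))
```

Looping run_cycle() over many indices, with traffic in the marked spot, approximates the workload; whether that alone triggers the stuck refcount is exactly what is unknown.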

We tried numerous netns-specific stress tests, but they all seem to pass.
It's not clear which combination causes the wrong refcnt.

We tried adding debugging to dst_ifdown(), thinking that some dev_hold()/
dev_put() combination is causing it, but the amount of logging over
24 hours is too much to go through.
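
One way to cope with that volume would be to reduce the log to counters instead of reading it line by line. The Python sketch below assumes a purely hypothetical debug line format of "hold <dev> <callsite>" or "put <dev> <callsite>" (this is not what the kernel emits by default, and not the format of our debugging patch) and sums holds minus puts per device, plus event counts per call site.

```python
from collections import Counter


def summarize(lines):
    """Reduce hold/put debug lines to counters.

    Assumed (hypothetical) line format: "<op> <dev> <callsite>",
    where <op> is "hold" or "put". Other lines are ignored.
    """
    net = Counter()      # device -> holds minus puts
    by_site = Counter()  # (device, op, callsite) -> number of events
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or parts[0] not in ("hold", "put"):
            continue
        op, dev, site = parts
        net[dev] += 1 if op == "hold" else -1
        by_site[(dev, op, site)] += 1
    return net, by_site


def leaked(net):
    """Devices whose holds and puts do not cancel out."""
    return {dev: n for dev, n in net.items() if n != 0}


if __name__ == "__main__":
    # Made-up sample input, only to show the shape of the result.
    sample = [
        "hold lo site_a",
        "hold lo site_b",
        "put lo site_c",
        "hold veth0 site_a",
        "put veth0 site_c",
    ]
    net, by_site = summarize(sample)
    print(leaked(net))
    for key, n in sorted(by_site.items()):
        print(key, n)
```

A device that still shows a nonzero balance when the unregister gets stuck points at the call sites worth instrumenting further.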

Similar bug descriptions have been reported on the Ubuntu forums a few times
without a real solution. It's not 6549dd43c043.

Inside a VM with one virtual CPU it hits every 12 hours or so.
On a physical machine, every 24 hours or so.

Any advice on where to look or what to try would be greatly appreciated.

Thanks
Alexei
