Date:	Wed, 13 Jul 2016 13:48:54 -0400
From:	Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Network Development <netdev@...r.kernel.org>,
	David Miller <davem@...emloft.net>,
	Craig Gallek <kraig@...gle.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH net] sock_diag: invert socket destroy broadcast check

On Fri, Jun 24, 2016 at 6:22 PM, Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
> On Fri, Jun 24, 2016 at 4:41 PM, Eric W. Biederman
> <ebiederm@...ssion.com> wrote:
>> Willem de Bruijn <willemdebruijn.kernel@...il.com> writes:
>>
>>> From: Willem de Bruijn <willemb@...gle.com>
>>>
>>> Socket destruction is only broadcast for a socket sk if a diag
>>> listener is registered and sk is not a kernel socket.
>>>
>>> Invert the test to not even check for listeners for kernel sockets.
>>>
>>> The sock_diag_has_destroy_listeners invocation dereferences
>>> sock_net(sk), which for kernel sockets can be invalid as they do not
>>> take a reference on the network namespace.
>>
>> No.  That isn't so.  A kernel socket for a network namespace must be
>> destroyed in the network namespace teardown.

I spent some more time looking at this.

inet_ctl_sock_destroy does not destroy the socket if there are still
skbs holding a reference on it (or on its sk_wmem_alloc). Skbs are
orphaned when they leave the namespace through dev_forward_skb, but
not when sent out a physical NIC (correctly, as that would break TSQ).
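
To make the lifetime rule concrete, here is a minimal userspace sketch of
it, not kernel code. All toy_* names are hypothetical, and the model
assumes only what is described above: the socket holds one unit of its
own write-memory count, each owned skb adds its truesize, and the socket
is freed when the count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical. */
struct toy_sock {
	int wmem_alloc;	/* 1 for the socket itself, plus owned skb bytes */
	bool freed;	/* set when the last reference is dropped */
};

struct toy_skb {
	struct toy_sock *sk;	/* owning socket, NULL once orphaned */
	int truesize;
};

static void toy_sock_init(struct toy_sock *sk)
{
	sk->wmem_alloc = 1;
	sk->freed = false;
}

/* Drop n units; the socket is only freed when the count hits zero. */
static void toy_sock_put_wmem(struct toy_sock *sk, int n)
{
	assert(sk->wmem_alloc >= n);
	sk->wmem_alloc -= n;
	if (sk->wmem_alloc == 0)
		sk->freed = true;
}

/* Charge an skb to its socket, as happens on the transmit path. */
static void toy_skb_set_owner(struct toy_skb *skb, struct toy_sock *sk,
			      int truesize)
{
	skb->sk = sk;
	skb->truesize = truesize;
	sk->wmem_alloc += truesize;
}

/* Orphaning uncharges the skb; modeled here for the forwarding case. */
static void toy_skb_orphan(struct toy_skb *skb)
{
	if (skb->sk) {
		toy_sock_put_wmem(skb->sk, skb->truesize);
		skb->sk = NULL;
	}
}

/* Namespace teardown drops only the socket's own unit. */
static void toy_sock_release(struct toy_sock *sk)
{
	toy_sock_put_wmem(sk, 1);
}

/* Tx completion on a physical device releases the skb much later. */
static void toy_tx_complete(struct toy_skb *skb)
{
	toy_skb_orphan(skb);
}
```

In this model, calling toy_sock_release() while an skb is still charged
leaves the socket alive, and the actual free happens inside
toy_tx_complete(), that is, after teardown has already finished.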

The bug happened with macvlan on top of bonding on top of a physical
nic. The macvlan lives in a temporary namespace. After the macvlan and
network namespace are destroyed, the physical device has a TCP RST skb
from net.ipv4->tcp_sk queued for tx completion.
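
As a rough illustration of why the order of the two tests matters in
this scenario, here is a small userspace sketch. It is not the patch
itself: the demo_* names and the net_refcnt flag are hypothetical
stand-ins, and a NULL pointer stands in for a namespace that has already
been freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical. */
struct demo_net {
	bool has_destroy_listeners;
};

struct demo_sock {
	struct demo_net *net;	/* may be stale for kernel sockets */
	bool net_refcnt;	/* false for kernel sockets */
};

static int demo_net_derefs;	/* counts dereferences of sk->net */

/* Stand-in for the listener check; it must dereference sk->net. */
static bool demo_has_destroy_listeners(const struct demo_sock *sk)
{
	demo_net_derefs++;
	return sk->net->has_destroy_listeners;
}

/*
 * Inverted order: test for a kernel socket first. Because && short
 * circuits, sk->net is never touched for kernel sockets.
 */
static bool demo_should_broadcast_destroy(const struct demo_sock *sk)
{
	return sk->net_refcnt && demo_has_destroy_listeners(sk);
}
```

With the operands swapped, the same call on a kernel socket whose
namespace is gone would dereference the stale pointer before the
kernel-socket test could reject it.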

I have not been able to reproduce this exact scenario, likely because
tx completion handling is on the order of microseconds and not easily
slowed sufficiently for testing. Using a tap device with skb_orphan
commented out, I can cause the issue. Commenting out skb_orphan is
clearly a gross hack. The point I wanted to verify is that the
underlying device is not stopped (and its queues are not cleaned of
skbs) when the macvlan device is destroyed.

Network namespace teardown is complex. Am I missing a step that
prevents the above, or does this indeed sound feasible in principle
(if very unlikely in practice)?
