Date: Wed, 13 Jul 2016 13:48:54 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Network Development <netdev@...r.kernel.org>, David Miller <davem@...emloft.net>, Craig Gallek <kraig@...gle.com>, Eric Dumazet <eric.dumazet@...il.com>, Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH net] sock_diag: invert socket destroy broadcast check

On Fri, Jun 24, 2016 at 6:22 PM, Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
> On Fri, Jun 24, 2016 at 4:41 PM, Eric W. Biederman
> <ebiederm@...ssion.com> wrote:
>> Willem de Bruijn <willemdebruijn.kernel@...il.com> writes:
>>
>>> From: Willem de Bruijn <willemb@...gle.com>
>>>
>>> Socket destruction is only broadcast for a socket sk if a diag
>>> listener is registered and sk is not a kernel socket.
>>>
>>> Invert the test to not even check for listeners for kernel sockets.
>>>
>>> The sock_diag_has_destroy_listeners invocation dereferences
>>> sock_net(sk), which for kernel sockets can be invalid as they do not
>>> take a reference on the network namespace.
>>
>> No. That isn't so. A kernel socket for a network namespace must be
>> destroyed in the network namespace teardown.

I spent some more time looking at this. inet_ctl_sock_destroy does
not destroy the socket if there are still skbs with a reference on
it (or on its sk_wmem_alloc). Skbs are orphaned when they leave the
namespace through dev_forward_skb, but not when they are sent out
through a physical NIC (correctly so, as orphaning there would break
TSQ).

The bug happened with macvlan on top of bonding on top of a physical
NIC. The macvlan lives in a temporary namespace. After the macvlan and
the network namespace are destroyed, the physical device still has a
TCP RST skb from net.ipv4->tcp_sk queued for tx completion.

I have not been able to reproduce this exact scenario, likely because
tx completion handling takes on the order of microseconds and is not
easily slowed down enough for testing.

Using a tap device with skb_orphan commented out, I can cause the
issue. Commenting out skb_orphan is clearly a gross hack. The point I
wanted to verify is that the underlying device is not stopped, and its
queues are not cleaned of skbs, when the macvlan device is destroyed.

Network namespace teardown is complex. Am I missing a step that
prevents the above, or does this indeed sound feasible in principle
(if very unlikely in practice)?
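
For readers following the thread, the reordering that the quoted patch description proposes can be sketched in user space. This is a minimal model, not kernel code: `model_net`, `model_sock`, `holds_net_ref`, `net_derefs` and both `should_broadcast_*` helpers are hypothetical names made up for this illustration. It only shows that, thanks to `&&` short-circuiting, testing "is this a kernel socket" first means the namespace pointer is never followed for kernel sockets.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: hypothetical stand-ins, not kernel structures. */
struct model_net {
	bool has_destroy_listeners;
};

struct model_sock {
	struct model_net *net; /* in the real case this may be stale for a kernel socket */
	bool holds_net_ref;    /* false models a kernel socket (no namespace reference) */
};

/* Counts how often the namespace pointer is followed. */
static int net_derefs;

/* Stand-in for the listener check, which has to look at the namespace. */
static bool model_has_destroy_listeners(const struct model_sock *sk)
{
	net_derefs++;
	return sk->net->has_destroy_listeners;
}

/* Order described as the old behaviour: listener check first. */
static bool should_broadcast_original(const struct model_sock *sk)
{
	return model_has_destroy_listeners(sk) && sk->holds_net_ref;
}

/* Inverted order: kernel sockets are rejected before the namespace is touched. */
static bool should_broadcast_inverted(const struct model_sock *sk)
{
	return sk->holds_net_ref && model_has_destroy_listeners(sk);
}
```

Both helpers return the same answer for every socket; the only difference is whether the namespace is inspected for a kernel socket, which is the access the patch description is concerned about.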