netdev - Re: Long stalls creating a new netns after a netns with a SMB client exits


Message-ID: <4a56fd38-11f6-4824-bdab-d0f2d46cdf51@gmail.com>
Date:   Fri, 28 Jul 2017 13:16:54 -0600
From:   David Ahern <dsahern@...il.com>
To:     Rolf Neugebauer <rolf.neugebauer@...ker.com>,
        Cong Wang <xiyou.wangcong@...il.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Long stalls creating a new netns after a netns with a SMB client
 exits

On 7/28/17 12:58 PM, Rolf Neugebauer wrote:
>>> I can readily reproduce this on 4.9.39, 4.11.12 and another user
>>> repro-ed it on 4.12.3. It seems to happen every time. At least one
>>> user reported issues with NFS mounts as well, but we were not able to
>>> reproduce it. It's not clear to me if this is directly related to
>>> 'mount.cifs' or if that just happens to reliably repro it.
>>
>> OK, so commit d747a7a51b00984127a88113c does not help this case
>> either.
> 
> d747a7a51b009("tcp: reset sk_rx_dst in tcp_disconnect()") indeed seems
> a different issue. As I understand that actually caused the ref count
> never to get decremented, while here eventually some cleanup kicks in
> after a long timeout.

It could be that a dst is cached on a socket and does not get released
until the socket timeouts have expired.

You can test that theory with something like this for IPv4 TCP (with a
similar change for UDP if the client is UDP based):

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 3a19ea28339f..37db087b6c97 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1855,7 +1855,7 @@ void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb)
 {
        struct dst_entry *dst = skb_dst(skb);

-       if (dst && dst_hold_safe(dst)) {
+       if (0 && dst && dst_hold_safe(dst)) {
                sk->sk_rx_dst = dst;
                inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
        }
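
To make the theory concrete, here is a minimal user-space sketch of the
reference chain being suspected: socket -> cached dst -> device. It is a
toy model only. All toy_* names are hypothetical stand-ins, not kernel
structures or APIs, and it assumes the stall comes from the device
reference staying pinned until the socket is torn down. The
cache_enabled flag plays the role of the "0 &&" in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); they only model reference counts. */
struct toy_dev  { int refcnt; };
struct toy_dst  { struct toy_dev *dev; int refcnt; };
struct toy_sock { struct toy_dst *rx_dst; };

static void toy_dst_hold(struct toy_dst *dst)
{
	dst->refcnt++;
}

static void toy_dst_release(struct toy_dst *dst)
{
	/* Dropping the last dst reference also drops its device reference. */
	if (--dst->refcnt == 0)
		dst->dev->refcnt--;
}

/* Same shape as the patched function: cache_enabled == false models "0 &&". */
static void toy_sk_rx_dst_set(struct toy_sock *sk, struct toy_dst *dst,
			      bool cache_enabled)
{
	if (cache_enabled && dst) {
		toy_dst_hold(dst);
		sk->rx_dst = dst;
	}
}

/* Models the socket finally going away after its timeouts. */
static void toy_sock_destroy(struct toy_sock *sk)
{
	if (sk->rx_dst) {
		toy_dst_release(sk->rx_dst);
		sk->rx_dst = NULL;
	}
}

/*
 * Returns the device refcount left after the routing side has dropped
 * its dst reference (netns exit) but before the socket is destroyed.
 */
static int toy_dev_refs_after_route_teardown(bool cache_enabled)
{
	struct toy_dev dev = { .refcnt = 1 };              /* held by the dst */
	struct toy_dst dst = { .dev = &dev, .refcnt = 1 }; /* held by routing */
	struct toy_sock sk = { .rx_dst = NULL };
	int pending;

	toy_sk_rx_dst_set(&sk, &dst, cache_enabled);
	toy_dst_release(&dst);   /* netns exit: routing drops its reference */
	pending = dev.refcnt;    /* non-zero means the device is still pinned */

	toy_sock_destroy(&sk);   /* socket timeouts done */
	assert(dev.refcnt == 0); /* everything is released in the end */
	return pending;
}
```

In this model, toy_dev_refs_after_route_teardown(true) leaves one device
reference pending until the socket is destroyed, while passing false
leaves none. If the real stall disappears with the "0 &&" change
applied, that would point at the cached rx dst in the same way.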

> 
> I'll also try if I can get some traces out of dev_hold()/dev_put().


The attached patch adds tracepoints to dev_hold() / dev_put(), which is
very useful for debugging cases like this. Use perf record and perf
script to collect and inspect the events.

View attachment "0001-Add-tracepoints-to-dev_hold-and-dev_put.patch" of type "text/plain" (2875 bytes)