[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM_iQpW4im2sk3-SXh0fgA9fNp1rN5=+UrwP2wATd=hh1CVo7Q@mail.gmail.com>
Date: Mon, 31 Jul 2017 15:01:07 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Rolf Neugebauer <rolf.neugebauer@...ker.com>
Cc: David Ahern <dsahern@...il.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Long stalls creating a new netns after a netns with a SMB client exits
On Mon, Jul 31, 2017 at 9:22 AM, Rolf Neugebauer
<rolf.neugebauer@...ker.com> wrote:
> On Fri, Jul 28, 2017 at 8:16 PM, David Ahern <dsahern@...il.com> wrote:
>> On 7/28/17 12:58 PM, Rolf Neugebauer wrote:
>>>>> I can readily reproduce this on 4.9.39, 4.11.12 and another user
>>>>> repro-ed it on 4.12.3. It seems to happen every time. At least one
>>>>> user reported issues with NFS mounts as well, but we were not able to
>>>>> reproduce it. It's not clear to me if this is directly related to
>>>>> 'mount.cifs' or if that just happens to reliably repro it.
>>>>
>>>> OK, so commit d747a7a51b00984127a88113c does not help this case
>>>> either.
>>>
>>> d747a7a51b009("tcp: reset sk_rx_dst in tcp_disconnect()") indeed seems
>>> a different issue. As I understand that actually caused the ref count
>>> never to get decremented, while here eventually some cleanup kicks in
>>> after a long timeout.
>>
>> It could be a dst is cached on a socket and does not get cleared until
>> the socket time outs are done.
>>
>> Test that theory by something like this for IPv4 TCP (similar change for
>> UDP if the client is UDP based):
>>
>> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
>> index 3a19ea28339f..37db087b6c97 100644
>> --- a/net/ipv4/tcp_ipv4.c
>> +++ b/net/ipv4/tcp_ipv4.c
>> @@ -1855,7 +1855,7 @@ void inet_sk_rx_dst_set(struct sock *sk, const
>> struct sk_buff *skb)
>> {
>> struct dst_entry *dst = skb_dst(skb);
>>
>> - if (dst && dst_hold_safe(dst)) {
>> + if (0 && dst && dst_hold_safe(dst)) {
>> sk->sk_rx_dst = dst;
>> inet_sk(sk)->rx_dst_ifindex = skb->skb_iif;
>> }
>
>
> This removes the 200s stall (the test is IPv4/TCP based)
Interesting. This means we have a kernel socket which holds
the dst refcnt.
Looking at the cifs code, it does create a TCP kernel socket
which doesn't hold refcnt to netns but its sk_rx_dst could
still be set as usual, therefore this socket could hold the dst
which holds lo device after the netns is gone. But its timeout
seems to be 60sec (SMB_ECHO_INTERVAL_DEFAULT),
not 200sec.
Ideally it should use a per netns socket so that it would have
a same life-time with netns. But you need to check this with
cifs developers, I don't understand cifs at all.
Powered by blists - more mailing lists