Message-ID: <50FF5592.60008@gmail.com>
Date: Wed, 23 Jan 2013 11:14:26 +0800
From: Li Yu <raise.sail@...il.com>
To: Bruce Curtis <brutus@...gle.com>
CC: David Miller <davem@...emloft.net>, netdev <netdev@...r.kernel.org>
Subject: Re: v3 for tcp friends?
On 2013-01-23 05:08, Bruce Curtis wrote:
> Thanks, Li
>
> Started working on friends again, v4, more soon.
>
>
:)
I found another odd bug in TCP friends v3: clients may hang in
tcp_sendmsg() -> sk_stream_wait_memory(), with or
without my refcnt fix patch.
The shell script below reproduces this bug:
#! /bin/sh -x
sysctl -w net.ipv4.tcp_rmem="4096 1073741824 1073741824"
sysctl -w net.ipv4.tcp_wmem="4096 1073741824 1073741824"
sysctl -w net.ipv4.tcp_friends=1
msg=64K
buf=256M
pkill -9 netserver
netserver
netperf -t TCP_STREAM -l 1 -- -m ${msg} -M ${msg} -s ${buf} -S ${buf} -4
sysctl -w net.ipv4.tcp_friends=0
pkill -9 netserver
netserver
netperf -t TCP_STREAM -l 1 -- -m ${msg} -M ${msg} -s ${buf} -S ${buf} -4
##################SCRIPT END###################
The netperf kernel stack (from cat /proc/$netperf_pid/stack) is:
[<ffffffff812ce939>] sk_stream_wait_memory+0x2d9/0x2f0
[<ffffffff8131460c>] tcp_sendmsg+0xf6c/0x1240
[<ffffffff8133c117>] inet_sendmsg+0xf7/0x110
[<ffffffff812bedfd>] sock_sendmsg+0x7d/0xa0
[<ffffffff812c0e4d>] sys_sendto+0x13d/0x190
[<ffffffff8138a6c2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
The netserver kernel stack is:
[<ffffffff812c46ae>] sk_wait_data+0x8e/0xe0
[<ffffffff81315993>] tcp_recvmsg+0x5c3/0xbe0
[<ffffffff8133aefd>] inet_recvmsg+0xed/0x110
[<ffffffff812becf4>] sock_recvmsg+0x84/0xb0
[<ffffffff812c0fae>] sys_recvfrom+0xee/0x170
[<ffffffff8138a6c2>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
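For anyone reading the traces above: each frame has the form "[<address>] symbol+0xoffset/0xsize", where offset is the position inside the function and size is the function's total length. Below is a minimal C sketch of a parser for one such line. The struct and function names are hypothetical (they are not from the kernel or from netperf), and frames carrying the "? " marker are not treated specially.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One frame of a /proc/<pid>/stack line: "[<addr>] symbol+0xoff/0xsize".
 * Hypothetical helper type, for illustration only. */
struct stack_frame {
    unsigned long long addr;    /* address recorded for this frame */
    char symbol[64];            /* function name */
    unsigned long long offset;  /* byte offset of addr from the symbol start */
    unsigned long long size;    /* total size of the function in bytes */
};

/* Returns 1 if the line matched the expected format, 0 otherwise. */
int parse_stack_frame(const char *line, struct stack_frame *f)
{
    int n;

    assert(line != NULL && f != NULL);
    memset(f, 0, sizeof(*f));

    /* %llx accepts an optional 0x prefix; %63[^+] stops at the '+'. */
    n = sscanf(line, " [<%llx>] %63[^+]+%llx/%llx",
               &f->addr, f->symbol, &f->offset, &f->size);
    return n == 4;
}
```

Feeding it the first netperf frame should give the symbol sk_stream_wait_memory with offset 0x2d9 and size 0x2f0, i.e. the task is sleeping near the end of that function.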
And "netstat -tnp" gives the following results:
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address    Foreign Address  State       PID/Program name
tcp   0      0      127.0.0.1:35451  127.0.0.1:12865  ESTABLISHED 2316/netperf
tcp6  0      0      127.0.0.1:12865  127.0.0.1:35451  ESTABLISHED 2317/netserver
(It seems that netperf hangs on the benchmark's control connection.)
I am also trying to fix this ...
Thanks
Yu
> On Mon, Jan 21, 2013 at 12:55 AM, Li Yu <raise.sail@...il.com> wrote:
>
> 2013/1/21 Li Yu <raise.sail@...il.com>
>
> On 2013-01-21 15:29, Li Yu wrote:
>
> On 2012-09-05 00:58, David Miller wrote:
>
> From: Bruce Curtis <brutus@...gle.com>
> Date: Tue, 4 Sep 2012 08:10:23 -0700
>
> Will do, issues addressed, I'll get the patch out later today or
> tomorrow at the latest.
>
>
> Thanks a lot Bruce.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>
> Hi, Bruce,
>
> I tested TCP friends and found a bug:
>
> [ 106.541372] Pid: 1765, comm: client Not tainted 3.7.0-rc1+ #25
> [ 106.543121] Call Trace:
> [ 106.543950] [<ffffffff8133d212>] inet_sock_destruct+0x102/0x1f0
> [ 106.545687] [<ffffffff812c38ad>] __sk_free+0x1d/0x110
> [ 106.547209] [<ffffffff812c3a1c>] sk_free+0x1c/0x20
> [ 106.548611] [<ffffffff8131680c>] tcp_close+0x6c/0x3f0
> [ 106.549863] [<ffffffff8133caea>] inet_release+0xda/0xf0
> [ 106.551134] [<ffffffff8133ca30>] ? inet_release+0x20/0xf0
> [ 106.552419] [<ffffffff8137f3de>] ? mutex_unlock+0xe/0x10
> [ 106.553658] [<ffffffff812bf948>] sock_release+0x28/0xa0
> [ 106.557366] [<ffffffff812bfd69>] sock_close+0x29/0x30
> [ 106.558831] [<ffffffff81128972>] __fput+0x122/0x210
> [ 106.560541] [<ffffffff81128a6e>] ____fput+0xe/0x10
> [ 106.562006] [<ffffffff8105354e>] task_work_run+0x9e/0xd0
> [ 106.563285] [<ffffffff810027e1>] do_notify_resume+0x61/0x70
> [ 106.564582] [<ffffffff8138a908>] int_signal+0x12/0x17
>
>
> I also backported it to the stable 3.7.3 kernel and the RHEL6 kernel
> and tested it there; the bug still exists. It seems that the client may
> close listening sockets. Do we need to take a reference on the listening
> socket before sending its address to the peer?
>
>
> Sorry, I left out an important kernel log line that comes before the trace above:
>
> [ 106.539367] IPv4: Attempt to release TCP socket in state 10 ffff880074abb5c0
>
> BTW: state 10 = TCP_LISTEN
>
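For reference, the number in that log message follows the state enum in include/net/tcp_states.h. Here is a small userspace lookup sketch, assuming the 3.7-era numbering (1 through 11); the helper names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names indexed by state number, mirroring include/net/tcp_states.h
 * as of the 3.7 kernel. Index 0 is unused. */
static const char *const tcp_state_names[] = {
    NULL,
    "TCP_ESTABLISHED",  /* 1 */
    "TCP_SYN_SENT",     /* 2 */
    "TCP_SYN_RECV",     /* 3 */
    "TCP_FIN_WAIT1",    /* 4 */
    "TCP_FIN_WAIT2",    /* 5 */
    "TCP_TIME_WAIT",    /* 6 */
    "TCP_CLOSE",        /* 7 */
    "TCP_CLOSE_WAIT",   /* 8 */
    "TCP_LAST_ACK",     /* 9 */
    "TCP_LISTEN",       /* 10 */
    "TCP_CLOSING",      /* 11 */
};

static_assert(sizeof(tcp_state_names) / sizeof(tcp_state_names[0]) == 12,
              "table must cover states 0..11");

/* Hypothetical helper: state number -> name, "UNKNOWN" if out of range. */
const char *tcp_state_name(int state)
{
    if (state < 1 || state > 11)
        return "UNKNOWN";
    return tcp_state_names[state];
}

/* Hypothetical helper: name -> state number, -1 if not found. */
int tcp_state_number(const char *name)
{
    int i;

    for (i = 1; i <= 11; i++)
        if (strcmp(tcp_state_names[i], name) == 0)
            return i;
    return -1;
}
```

With this table, tcp_state_name(10) yields "TCP_LISTEN", matching the note above.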
>
> It seems this patch works for me.
>
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 9641215..a625c02 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -623,8 +623,11 @@ struct sock *inet_csk_clone(struct sock *sk, const struct request_sock *req,
> sock_hold(newsk);
> was = xchg(&req->friend->sk_friend, newsk);
> /* If requester already connect()ed, maybe sleeping */
> - if (was && !sock_flag(req->friend, SOCK_DEAD))
> - sk->sk_state_change(req->friend);
> + if (was) {
> + if (!sock_flag(req->friend, SOCK_DEAD))
> + sk->sk_state_change(req->friend);
> + sock_put(was);
> + }
> }
> newsk->sk_state = TCP_SYN_RECV;
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 5917485..7a63245 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2277,8 +2277,10 @@ struct sk_buff *tcp_make_synack(struct sock *sk, struct dst_entry *dst,
> memset(&opts, 0, sizeof(opts));
> /* Only try to make friends if enabled */
> - if (sysctl_tcp_friends)
> + if (sysctl_tcp_friends) {
> + sock_hold(sk);
> skb->friend = sk;
> + }
> #ifdef CONFIG_SYN_COOKIES
> if (unlikely(req->cookie_ts))
>
>
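The pairing used in the patch above (take a reference before publishing a socket pointer, and drop the reference of whatever pointer xchg() returns) can be modelled in userspace. This is only a toy sketch with C11 atomics: struct obj, obj_hold(), obj_put() and publish() are hypothetical stand-ins for struct sock, sock_hold(), sock_put() and the xchg() on sk_friend, not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-in for struct sock: only the reference count matters here. */
struct obj {
    atomic_int refcnt;
};

/* A new object starts with one reference, held by its owner. */
void obj_init(struct obj *o)
{
    atomic_init(&o->refcnt, 1);
}

/* Models sock_hold(): take one more reference. */
void obj_hold(struct obj *o)
{
    atomic_fetch_add(&o->refcnt, 1);
}

/* Models sock_put(): drop one reference.
 * Returns 1 when the last reference went away, 0 otherwise. */
int obj_put(struct obj *o)
{
    int old = atomic_fetch_sub(&o->refcnt, 1);

    assert(old > 0); /* dropping a reference nobody holds is a bug */
    return old == 1;
}

/* Publish newo in slot: the slot takes its own reference first, then the
 * reference that was held for the previous occupant is dropped.
 * Returns 1 if the previous occupant lost its last reference. */
int publish(_Atomic(struct obj *) *slot, struct obj *newo)
{
    struct obj *was;

    obj_hold(newo);
    was = atomic_exchange(slot, newo);
    if (was)
        return obj_put(was);
    return 0;
}
```

In this model, publishing an object raises its count from 1 to 2, and replacing it in the slot brings the old object back to 1, so the owner's own put is the one that finally frees it.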
> And what about TCP friends v4? :)
>
> Thanks
>
> Yu
>
>
>