netdev - Re: [PATCH net-next 1/2] net: SOCKWQ_ASYNC

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1462220607.5535.273.camel@edumazet-glaptop3.roam.corp.google.com>
Date:	Mon, 02 May 2016 13:23:27 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Jiri Pirko <jiri@...nulli.us>
Cc:	Eric Dumazet <edumazet@...gle.com>,
	"David S . Miller" <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>, eladr@...lanox.com,
	idosch@...lanox.com
Subject: Re: [PATCH net-next 1/2] net: SOCKWQ_ASYNC_NOSPACE optimizations

On Mon, 2016-05-02 at 21:12 +0200, Jiri Pirko wrote:
> Mon, May 02, 2016 at 06:22:18PM CEST, eric.dumazet@...il.com wrote:
> >On Mon, 2016-05-02 at 18:16 +0200, Jiri Pirko wrote:
> >> Mon, Apr 25, 2016 at 07:39:32PM CEST, edumazet@...gle.com wrote:
> >> >SOCKWQ_ASYNC_NOSPACE is tested in sock_wake_async()
> >> >so that a SIGIO signal is sent when needed.
> >> >
> >> >tcp_sendmsg() clears the bit.
> >> >tcp_poll() sets the bit when stream is not writeable.
> >> >
> >> >We can avoid two atomic operations by first checking if socket
> >> >is actually interested in the FASYNC business (most sockets in
> >> >real applications do not use AIO, but select()/poll()/epoll())
> >> >
> >> >This also removes one cache line miss to access sk->sk_wq->flags
> >> >in tcp_sendmsg()
> >> >
> >> >Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> >> 
> >> I just bisected down to this. This is causing a regression for me when
> >> my nfs mount becomes stuck. I can easily reproduce this if you need to
> >> test the fix.
> >
> >What do you mean by 'when nfs mount becomes stuck' ?
> >
> >Is this patch making nfs not functional , or does it make recovery from
> >some nfs error bad ?
> 
> I can mount nfs on the host. But when I do something (compile a kernel
> module in my case), it gets stuck. Then I cannot even ssh to the machine.
> No messages in dmesg. I didn't debug it any further. I just bisected and
> verified that this patch caused this behaviour.

Interesting.

It looks like net/sunrpc/xprtsock.c should set SOCK_FASYNC
even if it is not actually using fasync_list

Could you try this quick hack to check if this is the right way ?

Thanks !

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index a6c68dc086af83233ee315642638f4a1990ee622..b90c5397b5e137c6cc8accad6eebe2b876363d4e 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1950,6 +1950,7 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt,
 		sk->sk_user_data = xprt;
 		sk->sk_data_ready = xs_data_ready;
 		sk->sk_write_space = xs_udp_write_space;
+		sock_set_flag(sk, SOCK_FASYNC);
 		sk->sk_error_report = xs_error_report;
 		sk->sk_allocation = GFP_NOIO;
 
@@ -2136,6 +2137,7 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 		sk->sk_user_data = xprt;
 		sk->sk_data_ready = xs_data_ready;
 		sk->sk_write_space = xs_udp_write_space;
+		sock_set_flag(sk, SOCK_FASYNC);
 		sk->sk_allocation = GFP_NOIO;
 
 		xprt_set_connected(xprt);
@@ -2237,6 +2239,7 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
 		sk->sk_data_ready = xs_tcp_data_ready;
 		sk->sk_state_change = xs_tcp_state_change;
 		sk->sk_write_space = xs_tcp_write_space;
+		sock_set_flag(sk, SOCK_FASYNC);
 		sk->sk_error_report = xs_error_report;
 		sk->sk_allocation = GFP_NOIO;