Message-ID: <95606cf6-8e48-39a4-4ca1-407e80157c32@tomt.net>
Date: Sun, 6 May 2018 03:06:23 +0200
From: Andre Tomt <andre@...t.net>
To: Dave Watson <davejwatson@...com>,
"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
borisp@...lanox.com, Aviad Yehezkel <aviadye@...lanox.com>
Subject: Re: [PATCH net] net/tls: Don't recursively call push_record during
tls_write_space callbacks

On 1 May 2018 22:05, Dave Watson wrote:
> It is reported that in some cases, write_space may be called in
> do_tcp_sendpages, such that we recursively invoke do_tcp_sendpages again:
>
> [ 660.468802] ? do_tcp_sendpages+0x8d/0x580
> [ 660.468826] ? tls_push_sg+0x74/0x130 [tls]
> [ 660.468852] ? tls_push_record+0x24a/0x390 [tls]
> [ 660.468880] ? tls_write_space+0x6a/0x80 [tls]
> ...
>
> tls_push_sg already does a loop over all sending sg's, so ignore
> any tls_write_space notifications until we are done sending.
> We then have to call the previous write_space to wake up
> poll() waiters after we are done with the send loop.
>
> Reported-by: Andre Tomt <andre@...t.net>
> Signed-off-by: Dave Watson <davejwatson@...com>

Unfortunately, it seems this patch has a bug: while it fixed the kernel
crash, it is causing some connections in my testbed to stall.

Making sure ctx->in_tcp_sendpages is also cleared before the "return ret"
inside the while (1) loop seems to fix it for me.

diff -Naurp a/net/tls/tls_main.c b/net/tls/tls_main.c
--- a/net/tls/tls_main.c 2018-05-06 02:23:41.638597066 +0200
+++ b/net/tls/tls_main.c 2018-05-06 01:59:14.378568139 +0200
@@ -135,6 +135,7 @@ retry:
 			offset -= sg->offset;
 			ctx->partially_sent_offset = offset;
 			ctx->partially_sent_record = (void *)sg;
+			ctx->in_tcp_sendpages = false;
 			return ret;
 		}
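
To illustrate why the early return matters, here is a minimal userspace
sketch of the guard-flag pattern. It is not the kernel code: struct
sketch_ctx, sketch_write_space() and sketch_push() are hypothetical
stand-ins, and the "partial send" is simulated by comparing two sizes.
The only point is that a flag set on entry has to be cleared on every
exit path, otherwise later write_space notifications keep being ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the TLS context; only the guard flag matters. */
struct sketch_ctx {
	bool in_tcp_sendpages;   /* true while the send loop is running */
	int  pushes_from_wakeup; /* wakeups that were allowed to push data */
};

/*
 * Hypothetical stand-in for a write_space callback: it does nothing while
 * the send loop is marked as running, to avoid recursing into it.
 * Returns true if the notification was acted upon.
 */
static bool sketch_write_space(struct sketch_ctx *ctx)
{
	if (ctx->in_tcp_sendpages)
		return false; /* notification ignored */
	ctx->pushes_from_wakeup++;
	return true;
}

/*
 * Hypothetical stand-in for the send loop. If fewer bytes can be sent than
 * requested, it returns early with -1. The clear_on_early_return argument
 * selects between the behaviour with and without the one-line fix.
 */
static int sketch_push(struct sketch_ctx *ctx, size_t want, size_t can_send,
		       bool clear_on_early_return)
{
	ctx->in_tcp_sendpages = true;

	if (can_send < want) {
		/* Early exit: without the reset, the flag stays set. */
		if (clear_on_early_return)
			ctx->in_tcp_sendpages = false;
		return -1;
	}

	ctx->in_tcp_sendpages = false;
	return 0;
}
```

In this sketch, calling sketch_push() with clear_on_early_return set to
false leaves the flag set after a partial send, so every later
sketch_write_space() call is ignored. That is consistent with the stalls
described above, though the sketch does not prove it is the only cause.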