Message-ID: <20231012133407.GA3359458@pengutronix.de>
Date: Thu, 12 Oct 2023 15:34:07 +0200
From: Sascha Hauer <sha@...gutronix.de>
To: Jens Axboe <axboe@...nel.dk>
Cc: Pavel Begunkov <asml.silence@...il.com>, io-uring@...r.kernel.org,
	linux-kernel@...r.kernel.org, kernel@...gutronix.de,
	Boris Pismenny <borisp@...dia.com>,
	John Fastabend <john.fastabend@...il.com>,
	Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org
Subject: Re: Problem with io_uring splice and KTLS

On Tue, Oct 10, 2023 at 08:28:13AM -0600, Jens Axboe wrote:
> On 10/10/23 8:19 AM, Sascha Hauer wrote:
> > Hi,
> > 
> > I am working with a webserver using io_uring in conjunction with KTLS. The
> > webserver basically splices static file data from a pipe to a socket which uses
> > KTLS for encryption. When splice is done the socket is closed. This works fine
> > when using software encryption in KTLS. Things go awry though when the software
> > encryption is replaced with the CAAM driver which replaces the synchronous
> > encryption with an asynchronous queue/interrupt/completion flow.
> > 
> > So far I have traced it down to tls_push_sg() calling tcp_sendmsg_locked() to
> > send the completed encrypted messages. tcp_sendmsg_locked() sometimes waits for
> > more memory on the socket by calling sk_stream_wait_memory(). This in turn
> > returns -ERESTARTSYS due to:
> > 
> >         if (signal_pending(current))
> >                 goto do_interrupted;
> > 
> > The current task has the TIF_NOTIFY_SIGNAL set due to:
> > 
> > io_req_normal_work_add()
> > {
> >         ...
> >         /* This interrupts sk_stream_wait_memory() (notify_method == TWA_SIGNAL) */
> >         task_work_add(req->task, &tctx->task_work, ctx->notify_method);
> > }
> > 
> > The call stack when sk_stream_wait_memory() fails is as follows:
> > 
> > [ 1385.428816]  dump_backtrace+0xa0/0x128
> > [ 1385.432568]  show_stack+0x20/0x38
> > [ 1385.435878]  dump_stack_lvl+0x48/0x60
> > [ 1385.439539]  dump_stack+0x18/0x28
> > [ 1385.442850]  tls_push_sg+0x100/0x238
> > [ 1385.446424]  tls_tx_records+0x118/0x1d8
> > [ 1385.450257]  tls_sw_release_resources_tx+0x74/0x1a0
> > [ 1385.455135]  tls_sk_proto_close+0x2f8/0x3f0
> > [ 1385.459315]  inet_release+0x58/0xb8
> > [ 1385.462802]  inet6_release+0x3c/0x60
> > [ 1385.466374]  __sock_release+0x48/0xc8
> > [ 1385.470035]  sock_close+0x20/0x38
> > [ 1385.473347]  __fput+0xbc/0x280
> > [ 1385.476399]  ____fput+0x18/0x30
> > [ 1385.479537]  task_work_run+0x80/0xe0
> > [ 1385.483108]  io_run_task_work+0x40/0x108
> > [ 1385.487029]  __arm64_sys_io_uring_enter+0x164/0xad8
> > [ 1385.491907]  invoke_syscall+0x50/0x128
> > [ 1385.495655]  el0_svc_common.constprop.0+0x48/0xf0
> > [ 1385.500359]  do_el0_svc_compat+0x24/0x40
> > [ 1385.504279]  el0_svc_compat+0x38/0x108
> > [ 1385.508026]  el0t_32_sync_handler+0x98/0x140
> > [ 1385.512294]  el0t_32_sync+0x194/0x198
> > 
> > So the socket is being closed and KTLS tries to send out the remaining
> > completed messages.  From a splice point of view everything has been sent
> > successfully, but not everything made it through KTLS to the socket and the
> > remaining data is sent while closing the socket.
> > 
> > I vaguely understand what's going on here, but I haven't got the
> > slightest idea what to do about this. Any ideas?
> 
> Two things to try:
> 
> 1) Depending on how you use the ring, set it up with
> IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN. The latter will
> avoid using signal based task_work notifications, which may be messing
> you up here.
> 
> 2) io_uring will hold a reference to the file/socket. I'm unsure if this
> is a problem in the above case, but sometimes it'll prevent the final
> flush.
> 
> Do you have a reproducer that could be run to test? Sometimes easier to
> see what's going on when you can experiment, it'll save some time.

Okay, here is a reproducer:

https://github.com/saschahauer/webserver-uring-test.git

Run ./prepare.sh in that repository. It compiles the webserver,
generates cert.pem/key.pem and generates a test file to download. If the
meson build doesn't work for you, you can compile the program by
hand with something like:

gcc -O3 -Wall -o webserver webserver_liburing.c -lcrypto -lssl -luring

When the webserver is started you can get a file from it with:

curl -k https://<ipaddr>:8443/foo -o foo

or:

while true; do curl -k https://<ipaddr>:8443/foo -o foo; if [ $? != 0 ]; then break; fi; done

This should run without problems, because by default the encryption
requests most likely run synchronously.

In case you don't have encryption hardware you can create an
asynchronous encryption module using cryptd. Compile a kernel with
CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
webserver with the '-c' option. /proc/crypto should then contain an
entry with:

 name         : gcm(aes)
 driver       : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
 module       : kernel
 priority     : 150

Make sure there is no other module providing gcm(aes) with a priority higher
than 150 so that this one is actually used.

With that the while true loop above should break out with a short read
fairly fast. Passing IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN
to io_uring_queue_init() makes it harder to reproduce for me. With that
I need multiple shells in parallel running the above loop.
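
For reference, here is a minimal sketch of setting up a ring with these
two flags. It is not taken from the webserver: to stay free of library
dependencies it calls io_uring_setup(2) directly, whereas with liburing
the same flags go into the third argument of io_uring_queue_init().
setup_ring_deferred() is a hypothetical name, and the fallback to no
flags on -EINVAL is an assumption about how one might handle kernels
that predate the flags.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/io_uring.h>

/* Older kernel headers may lack these definitions. */
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER (1U << 12)
#endif
#ifndef IORING_SETUP_DEFER_TASKRUN
#define IORING_SETUP_DEFER_TASKRUN (1U << 13)
#endif

/*
 * Hypothetical helper: create a ring with SINGLE_ISSUER | DEFER_TASKRUN.
 * Returns the ring fd, or -errno on failure. *flags_used reports which
 * setup flags were actually passed to the kernel.
 */
int setup_ring_deferred(unsigned entries, unsigned *flags_used)
{
	const unsigned want = IORING_SETUP_SINGLE_ISSUER |
			      IORING_SETUP_DEFER_TASKRUN;
	struct io_uring_params p;
	int fd;

	assert(flags_used != NULL);

	memset(&p, 0, sizeof(p));
	p.flags = want;
	*flags_used = want;
	fd = (int)syscall(__NR_io_uring_setup, entries, &p);

	if (fd < 0 && errno == EINVAL) {
		/* Assumed fallback: kernel rejects the flags, retry without. */
		memset(&p, 0, sizeof(p));
		*flags_used = 0;
		fd = (int)syscall(__NR_io_uring_setup, entries, &p);
	}
	return fd < 0 ? -errno : fd;
}
```

Note that IORING_SETUP_DEFER_TASKRUN is only accepted together with
IORING_SETUP_SINGLE_ISSUER, and that with it completions are processed
only when the submitting task enters the kernel to wait for events.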

The repository also contains a kernel patch which will provide you a
stack dump when KTLS gets an error from tcp_sendmsg_locked().

Now I hope I haven't done anything silly in the webserver ;)

Sascha


-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
