Message-ID: <ZPm__x5TcsmqagBH@hog>
Date: Thu, 7 Sep 2023 14:21:59 +0200
From: Sabrina Dubroca <sd@...asysnail.net>
To: Jakub Kicinski <kuba@...nel.org>,
Herbert Xu <herbert@...dor.apana.org.au>
Cc: netdev@...r.kernel.org, Dave Watson <davejwatson@...com>,
Vakul Garg <vakul.garg@....com>, Boris Pismenny <borisp@...dia.com>,
John Fastabend <john.fastabend@...il.com>
Subject: Re: [PATCH net 5/5] tls: don't decrypt the next record if it's of a different type

2023-09-06, 20:47:27 -0700, Jakub Kicinski wrote:
> On Wed, 6 Sep 2023 19:08:35 +0200 Sabrina Dubroca wrote:
> > If the next record is of a different type, we won't copy it to
> > userspace in this round, tls_record_content_type will stop us just
> > after decryption. Skip decryption until the next recvmsg() call.
> >
> > This fixes a use-after-free when a data record is decrypted
> > asynchronously but doesn't fill the userspace buffer, and the next
> > record is non-data, for example in the bad_cmsg selftest.
>
> What's the UAF on?

It doesn't reproduce reliably unless I set cryptd_delay_ms from my
debug patch (10 is enough):
https://patchwork.kernel.org/project/linux-crypto/patch/9d664093b1bf7f47497b2c40b3a085b45f3274a2.1694021240.git.sd@queasysnail.net/

UAF is on the crypto_async_request (part of the aead_request,
allocated in the big kmalloc in tls_decrypt_sg), mostly caught when
cryptd_queue_worker calls crypto_request_complete, but sometimes a bit
earlier (in crypto_dequeue_request).

I'll admit this patch is papering over the issue a bit, but decrypting
a record we know we won't read within this recv() call seems a bit
pointless.

I wonder if the way we're using ctx->async_wait here is correct. I'm
observing crypto_wait_req return 0 even though the decryption hasn't
run yet (and it should return -EBADMSG, not 0). I guess
tls_decrypt_done calls the completion (since we only had one
decrypt_pending), and then crypto_wait_req thinks everything is
already done.

Adding a fresh crypto_wait in tls_do_decryption (DECLARE_CRYPTO_WAIT)
and using it in the !darg->async case also seems to fix the UAF (but
makes the bad_cmsg test case fail in the same way as what I wrote in
the cover letter for bad_in_large_read -- not decrypting the next
message at all makes the selftest pass).

Herbert, WDYT? We're calling tls_do_decryption twice from the same
tls_sw_recvmsg invocation, first with darg->async = true, then with
darg->async = false. Is it ok to use ctx->async_wait for both, or do
we need a fresh one as in this patch?

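The sequence suspected above can be sketched as a small single-threaded
userspace model. This is not kernel code: struct model_wait and the
model_*() helpers are hypothetical stand-ins for struct crypto_wait,
complete(), crypto_req_done and crypto_wait_req, and the ordering
(record A finishes while the caller waits for record B) is the guess
from this message, not something verified against the crypto API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's struct crypto_wait. */
struct model_wait {
	unsigned int done;	/* completion events not yet consumed */
	int err;		/* status stored by the completion callback */
};

/* Stand-in for complete(): record one completion event. */
static void model_complete(struct model_wait *w)
{
	w->done++;
}

/* Stand-in for crypto_req_done(): store the status, then complete. */
static void model_req_done(struct model_wait *w, int err)
{
	w->err = err;
	model_complete(w);
}

/*
 * Stand-in for crypto_wait_req() on a request that returned -EINPROGRESS.
 * Returns 1 and fills *err if an event is available; returns 0 where the
 * real function would sleep.
 */
static int model_try_wait(struct model_wait *w, int *err)
{
	if (!w->done)
		return 0;
	w->done = 0;		/* models reinit_completion() */
	*err = w->err;
	return 1;
}

/* Record B waits on the same wait object that record A's callback signals. */
static int shared_wait_scenario(int *b_in_flight)
{
	struct model_wait ctx_wait = { 0, 0 };	/* models ctx->async_wait */
	int pending = 0, err = 1, woke;

	pending++;		/* record A submitted with darg->async = true */
	*b_in_flight = 1;	/* record B submitted sync, still queued */

	/* Record A finishes: the pending count drops to 0, shared wait fires. */
	if (--pending == 0)
		model_complete(&ctx_wait);

	/* B's waiter consumes A's event and sees the stale status 0. */
	woke = model_try_wait(&ctx_wait, &err);
	assert(woke);
	(void)woke;
	return err;		/* B is still in flight at this point */
}

/* Record B waits on its own wait object (the DECLARE_CRYPTO_WAIT variant). */
static int fresh_wait_scenario(int *b_in_flight)
{
	struct model_wait ctx_wait = { 0, 0 };	/* models ctx->async_wait */
	struct model_wait b_wait = { 0, 0 };	/* per-call wait for record B */
	int pending = 0, err = 0, woke;

	pending++;
	*b_in_flight = 1;

	if (--pending == 0)
		model_complete(&ctx_wait);

	/* A's completion does not wake B's waiter. */
	woke = model_try_wait(&b_wait, &err);
	assert(!woke);

	/* Record B finishes and fails authentication. */
	model_req_done(&b_wait, -EBADMSG);
	*b_in_flight = 0;

	woke = model_try_wait(&b_wait, &err);
	assert(woke);
	(void)woke;
	return err;
}
```

In this model, shared_wait_scenario() returns 0 with record B still in
flight, which would match the early return and the later UAF once the
caller frees the request, while fresh_wait_scenario() returns -EBADMSG
only after record B has actually completed.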
-------- 8< --------
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 86b835b15872..ad51960f2864 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -246,6 +246,7 @@ static int tls_do_decryption(struct sock *sk,
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_prot_info *prot = &tls_ctx->prot_info;
 	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
+	DECLARE_CRYPTO_WAIT(wait);
 	int ret;
 
 	aead_request_set_tfm(aead_req, ctx->aead_recv);
@@ -262,7 +263,7 @@ static int tls_do_decryption(struct sock *sk,
 	} else {
 		aead_request_set_callback(aead_req,
 					  CRYPTO_TFM_REQ_MAY_BACKLOG,
-					  crypto_req_done, &ctx->async_wait);
+					  crypto_req_done, &wait);
 	}
 
 	ret = crypto_aead_decrypt(aead_req);
@@ -270,7 +271,7 @@ static int tls_do_decryption(struct sock *sk,
 		if (darg->async)
 			return 0;
 
-		ret = crypto_wait_req(ret, &ctx->async_wait);
+		ret = crypto_wait_req(ret, &wait);
 	}
 	darg->async = false;
 
--
Sabrina