Message-ID: <20210525103915.05264e8c@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Tue, 25 May 2021 10:39:15 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Maxim Mikityanskiy <maximmi@...dia.com>
Cc: Boris Pismenny <borisp@...dia.com>,
    John Fastabend <john.fastabend@...il.com>,
    Daniel Borkmann <daniel@...earbox.net>,
    "David S. Miller" <davem@...emloft.net>,
    "Aviad Yehezkel" <aviadye@...dia.com>,
    Tariq Toukan <tariqt@...dia.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH net 2/2] net/tls: Fix use-after-free after the TLS device goes down and up

On Mon, 24 May 2021 15:12:20 +0300 Maxim Mikityanskiy wrote:
> When a netdev with active TLS offload goes down, tls_device_down is
> called to stop the offload and tear down the TLS context. However, the
> socket stays alive, and it still points to the TLS context, which is now
> deallocated. If a netdev goes up, while the connection is still active,
> and the data flow resumes after a number of TCP retransmissions, it will
> lead to a use-after-free of the TLS context.
>
> This commit addresses this bug by keeping the context alive until its
> normal destruction, and implements the necessary fallbacks, so that the
> connection can resume in software (non-offloaded) kTLS mode.
>
> On the TX side tls_sw_fallback is used to encrypt all packets. The RX
> side already has all the necessary fallbacks, because receiving
> non-decrypted packets is supported. The thing needed on the RX side is
> to block resync requests, which are normally produced after receiving
> non-decrypted packets.
>
> The necessary synchronization is implemented for a graceful teardown:
> first the fallbacks are deployed, then the driver resources are released
> (it used to be possible to have a tls_dev_resync after tls_dev_del).
>
> A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
> mode. It's used to skip the RX resync logic completely, as it becomes
> useless, and some objects may be released (for example, resync_async,
> which is allocated and freed by the driver).
>
> Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure")
> Signed-off-by: Maxim Mikityanskiy <maximmi@...dia.com>
> Reviewed-by: Tariq Toukan <tariqt@...dia.com>
> @@ -961,6 +964,17 @@ int tls_device_decrypted(struct sock *sk, struct tls_context *tls_ctx,
>  
>  	ctx->sw.decrypted |= is_decrypted;
>  
> +	if (unlikely(test_bit(TLS_RX_DEV_DEGRADED, &tls_ctx->flags))) {

Why not put the check in tls_device_core_ctrl_rx_resync()?
Would be less code, right?

> +		if (likely(is_encrypted || is_decrypted))
> +			return 0;
> +
> +		/* After tls_device_down disables the offload, the next SKB will
> +		 * likely have initial fragments decrypted, and final ones not
> +		 * decrypted. We need to reencrypt that single SKB.
> +		 */
> +		return tls_device_reencrypt(sk, skb);
> +	}
> +
>  	/* Return immediately if the record is either entirely plaintext or
>  	 * entirely ciphertext. Otherwise handle reencrypt partially decrypted
>  	 * record.
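
For readers following the hunk above, the degraded-mode decision can be
modeled in plain userspace C. This is only a rough sketch, not kernel code:
rx_model_ctx, rx_model_classify() and the rx_model_action values are
hypothetical names made up for illustration, the per-fragment bool array
stands in for the skb fragments' decrypted state, and the non-degraded path
is deliberately left as an opaque "normal path" result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum rx_model_action {
	RX_MODEL_PASS,		/* record is uniformly decrypted or uniformly encrypted */
	RX_MODEL_REENCRYPT,	/* mixed record seen in degraded mode */
	RX_MODEL_NORMAL_PATH	/* offload still active; resync logic not modeled */
};

struct rx_model_ctx {
	bool degraded;		/* stands in for the TLS_RX_DEV_DEGRADED flag */
};

/* frag_decrypted[i] is true if fragment i was decrypted by the device. */
enum rx_model_action rx_model_classify(const struct rx_model_ctx *ctx,
				       const bool *frag_decrypted,
				       size_t nfrags)
{
	bool is_decrypted = true;
	bool is_encrypted = true;
	size_t i;

	assert(nfrags > 0);

	/* A record counts as decrypted (encrypted) only if every fragment is. */
	for (i = 0; i < nfrags; i++) {
		is_decrypted = is_decrypted && frag_decrypted[i];
		is_encrypted = is_encrypted && !frag_decrypted[i];
	}

	if (ctx->degraded) {
		/* Uniform records need no extra work in fallback mode. */
		if (is_encrypted || is_decrypted)
			return RX_MODEL_PASS;

		/* Partially decrypted record: re-encrypt it so that software
		 * can decrypt the whole record.
		 */
		return RX_MODEL_REENCRYPT;
	}

	return RX_MODEL_NORMAL_PATH;
}
```

In this model, a degraded context with fragments {decrypted, decrypted,
not decrypted} yields RX_MODEL_REENCRYPT, while an all-decrypted or
all-encrypted record yields RX_MODEL_PASS.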