netdev - Re: [PATCH net 2/2] net/tls: Fix use-after-free after the TLS device goes down and up


Message-ID: <20210525103915.05264e8c@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Tue, 25 May 2021 10:39:15 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Maxim Mikityanskiy <maximmi@...dia.com>
Cc:     Boris Pismenny <borisp@...dia.com>,
        John Fastabend <john.fastabend@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        "David S. Miller" <davem@...emloft.net>,
        "Aviad Yehezkel" <aviadye@...dia.com>,
        Tariq Toukan <tariqt@...dia.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH net 2/2] net/tls: Fix use-after-free after the TLS
 device goes down and up

On Mon, 24 May 2021 15:12:20 +0300 Maxim Mikityanskiy wrote:
> When a netdev with active TLS offload goes down, tls_device_down is
> called to stop the offload and tear down the TLS context. However, the
> socket stays alive, and it still points to the TLS context, which is now
> deallocated. If a netdev goes up, while the connection is still active,
> and the data flow resumes after a number of TCP retransmissions, it will
> lead to a use-after-free of the TLS context.
> 
> This commit addresses this bug by keeping the context alive until its
> normal destruction, and implements the necessary fallbacks, so that the
> connection can resume in software (non-offloaded) kTLS mode.
> 
> On the TX side tls_sw_fallback is used to encrypt all packets. The RX
> side already has all the necessary fallbacks, because receiving
> non-decrypted packets is supported. The thing needed on the RX side is
> to block resync requests, which are normally produced after receiving
> non-decrypted packets.
> 
> The necessary synchronization is implemented for a graceful teardown:
> first the fallbacks are deployed, then the driver resources are released
> (it used to be possible to have a tls_dev_resync after tls_dev_del).
> 
> A new flag called TLS_RX_DEV_DEGRADED is added to indicate the fallback
> mode. It's used to skip the RX resync logic completely, as it becomes
> useless, and some objects may be released (for example, resync_async,
> which is allocated and freed by the driver).
> 
> Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure")
> Signed-off-by: Maxim Mikityanskiy <maximmi@...dia.com>
> Reviewed-by: Tariq Toukan <tariqt@...dia.com>

> @@ -961,6 +964,17 @@ int tls_device_decrypted(struct sock *sk, struct tls_context *tls_ctx,
>  
>  	ctx->sw.decrypted |= is_decrypted;
>  
> +	if (unlikely(test_bit(TLS_RX_DEV_DEGRADED, &tls_ctx->flags))) {

Why not put the check in tls_device_core_ctrl_rx_resync()?
Would be less code, right?
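
For readers following the thread, the placement being asked about can be pictured with a small userspace model. This is only a sketch, not kernel code: `model_ctx`, `MODEL_RX_DEV_DEGRADED` and `model_rx_resync()` are hypothetical stand-ins for `tls_context`, `TLS_RX_DEV_DEGRADED` and `tls_device_core_ctrl_rx_resync()`, and the bit value is arbitrary. The guard sits at the top of the resync helper, so a caller needs no separate degraded-mode branch just to suppress resync requests.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the flag bit; the value is arbitrary here. */
#define MODEL_RX_DEV_DEGRADED (1UL << 0)

struct model_ctx {
	unsigned long flags;       /* models tls_ctx->flags */
	unsigned int resync_reqs;  /* counts resync requests passed on to the driver */
};

/* Models the resync helper with the degraded check placed inside it.
 * Returns true if a resync request was issued, false if it was suppressed.
 */
static bool model_rx_resync(struct model_ctx *ctx)
{
	if (ctx->flags & MODEL_RX_DEV_DEGRADED)
		return false;   /* offload is gone: never ask the driver to resync */

	ctx->resync_reqs++;
	return true;
}
```

In this model a caller invokes `model_rx_resync()` unconditionally; once the degraded bit is set the call becomes a no-op and the counter stops advancing.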

> +		if (likely(is_encrypted || is_decrypted))
> +			return 0;
> +
> +		/* After tls_device_down disables the offload, the next SKB will
> +		 * likely have initial fragments decrypted, and final ones not
> +		 * decrypted. We need to reencrypt that single SKB.
> +		 */
> +		return tls_device_reencrypt(sk, skb);
> +	}
> +
>  	/* Return immediately if the record is either entirely plaintext or
>  	 * entirely ciphertext. Otherwise handle reencrypt partially decrypted
>  	 * record.