netdev - Re: [PATCH net 1/3] net/tls: avoid NULL-deref on resync during device removal

Message-ID: <20190530164001.35a26331@cakuba.netronome.com>
Date:   Thu, 30 May 2019 16:40:01 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     davem@...emloft.net
Cc:     netdev@...r.kernel.org, oss-drivers@...ronome.com,
        borisp@...lanox.com, alexei.starovoitov@...il.com,
        Dirk van der Merwe <dirk.vandermerwe@...ronome.com>
Subject: Re: [PATCH net 1/3] net/tls: avoid NULL-deref on resync during
 device removal

On Tue, 21 May 2019 19:02:00 -0700, Jakub Kicinski wrote:
> When a netdev with active kTLS sockets is unregistered, the
> notifier callback walks the offloaded sockets and cleans up
> offload state.  RX data may still be processed, however, and
> if resync was requested prior to device removal we would hit
> a NULL pointer dereference on ctx->netdev use.
> 
> Make sure resync is under the device offload lock
> and NULL-check the netdev pointer.
> 
> This should be safe, because the pointer is set to
> NULL either in the netdev notifier (under said lock)
> or when the socket is completely dead and no resync can
> happen.
> 
> The other access to ctx->netdev in tls_validate_xmit_skb()
> does not dereference the pointer, it just compares it against
> another device pointer, so it should be pretty safe (perhaps
> we can add a READ_ONCE/WRITE_ONCE there, if paranoid).
> 
> Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload")
> Signed-off-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@...ronome.com>
> ---
>  net/tls/tls_device.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> index ca54a7c7ec81..aa33e4accc32 100644
> --- a/net/tls/tls_device.c
> +++ b/net/tls/tls_device.c
> @@ -553,8 +553,8 @@ void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
>  void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn)
>  {
>  	struct tls_context *tls_ctx = tls_get_ctx(sk);
> -	struct net_device *netdev = tls_ctx->netdev;
>  	struct tls_offload_context_rx *rx_ctx;
> +	struct net_device *netdev;
>  	u32 is_req_pending;
>  	s64 resync_req;
>  	u32 req_seq;
> @@ -568,10 +568,15 @@ void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn)
>  	is_req_pending = resync_req;
>  
>  	if (unlikely(is_req_pending) && req_seq == seq &&
> -	    atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0))
> -		netdev->tlsdev_ops->tls_dev_resync_rx(netdev, sk,
> -						      seq + TLS_HEADER_SIZE - 1,
> -						      rcd_sn);
> +	    atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0)) {
> +		seq += TLS_HEADER_SIZE - 1;
> +		down_read(&device_offload_lock);

Sorry, this may actually cause a sleep in atomic context: down_read() can
sleep, and it turns out resync may get called directly from softirq under
certain conditions.

Would it be possible to drop this from stable?  I can post a revert plus a
new fix (probably based on a refcount), or should I post an incremental fix?
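
To illustrate the constraint, here is a minimal userspace sketch of a guard
that never blocks in the RX path. It is neither the patch above nor the
refcount-based fix mentioned here, just one generic alternative to a sleeping
lock: a try-lock flag. All names (fake_netdev, fake_ctx, fake_resync,
fake_device_down) are hypothetical stand-ins, and C11 atomics are used in
place of kernel primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device. */
struct fake_netdev {
	int resync_calls;
};

/* Hypothetical stand-in for struct tls_context. */
struct fake_ctx {
	_Atomic(struct fake_netdev *) netdev;	/* set to NULL on device removal */
	atomic_flag resync_running;		/* non-sleeping guard */
};

/* RX path: may run where sleeping is forbidden, so it never waits. */
static bool fake_resync(struct fake_ctx *ctx)
{
	struct fake_netdev *dev;
	bool called = false;

	/* Guard already held (e.g. by removal): skip rather than block. */
	if (atomic_flag_test_and_set_explicit(&ctx->resync_running,
					      memory_order_acquire))
		return false;

	dev = atomic_load(&ctx->netdev);
	if (dev) {		/* NULL-check, as in the patch */
		dev->resync_calls++;
		called = true;
	}

	atomic_flag_clear_explicit(&ctx->resync_running, memory_order_release);
	return called;
}

/* Removal path: allowed to wait, so it waits out any resync in flight. */
static void fake_device_down(struct fake_ctx *ctx)
{
	while (atomic_flag_test_and_set_explicit(&ctx->resync_running,
						 memory_order_acquire))
		;	/* busy-wait in this sketch only */

	atomic_store(&ctx->netdev, NULL);
	atomic_flag_clear_explicit(&ctx->resync_running, memory_order_release);
}
```

The trade-off in this sketch is that a resync request racing with removal is
simply dropped, which is assumed to be acceptable because the device is going
away anyway.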

> +		netdev = tls_ctx->netdev;
> +		if (netdev)
> +			netdev->tlsdev_ops->tls_dev_resync_rx(netdev, sk, seq,
> +							      rcd_sn);
> +		up_read(&device_offload_lock);
> +	}
>  }
>  
>  static int tls_device_reencrypt(struct sock *sk, struct sk_buff *skb)