Message-ID: <20190530164001.35a26331@cakuba.netronome.com>
Date: Thu, 30 May 2019 16:40:01 -0700
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: davem@...emloft.net
Cc: netdev@...r.kernel.org, oss-drivers@...ronome.com,
borisp@...lanox.com, alexei.starovoitov@...il.com,
Dirk van der Merwe <dirk.vandermerwe@...ronome.com>
Subject: Re: [PATCH net 1/3] net/tls: avoid NULL-deref on resync during
 device removal

On Tue, 21 May 2019 19:02:00 -0700, Jakub Kicinski wrote:
> When a netdev with active kTLS sockets is unregistered, the
> notifier callback walks the offloaded sockets and cleans up
> the offload state. RX data may still be processed, however,
> and if a resync was requested prior to device removal we
> would hit a NULL pointer dereference on ctx->netdev use.
>
> Make sure resync is under the device offload lock
> and NULL-check the netdev pointer.
>
> This should be safe, because the pointer is set to
> NULL either in the netdev notifier (under said lock)
> or when socket is completely dead and no resync can
> happen.
>
> The other access to ctx->netdev, in tls_validate_xmit_skb(),
> does not dereference the pointer; it only compares it against
> another device pointer, so it should be pretty safe (perhaps
> we can add a READ_ONCE/WRITE_ONCE there, if paranoid).
>
> Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload")
> Signed-off-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@...ronome.com>
> ---
> net/tls/tls_device.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> index ca54a7c7ec81..aa33e4accc32 100644
> --- a/net/tls/tls_device.c
> +++ b/net/tls/tls_device.c
> @@ -553,8 +553,8 @@ void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
>  void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn)
>  {
>  	struct tls_context *tls_ctx = tls_get_ctx(sk);
> -	struct net_device *netdev = tls_ctx->netdev;
>  	struct tls_offload_context_rx *rx_ctx;
> +	struct net_device *netdev;
>  	u32 is_req_pending;
>  	s64 resync_req;
>  	u32 req_seq;
> @@ -568,10 +568,15 @@ void handle_device_resync(struct sock *sk, u32 seq, u64 rcd_sn)
>  	is_req_pending = resync_req;
>
>  	if (unlikely(is_req_pending) && req_seq == seq &&
> -	    atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0))
> -		netdev->tlsdev_ops->tls_dev_resync_rx(netdev, sk,
> -						      seq + TLS_HEADER_SIZE - 1,
> -						      rcd_sn);
> +	    atomic64_try_cmpxchg(&rx_ctx->resync_req, &resync_req, 0)) {
> +		seq += TLS_HEADER_SIZE - 1;
> +		down_read(&device_offload_lock);

Sorry, the down_read() here may actually cause a sleep in atomic
context: it turns out resync may get called directly from softirq
under certain conditions.

Would it be possible to drop this from stable? I can post a revert
plus a new fix (probably based on a refcount), or should I post an
incremental fix instead?

> +		netdev = tls_ctx->netdev;
> +		if (netdev)
> +			netdev->tlsdev_ops->tls_dev_resync_rx(netdev, sk, seq,
> +							      rcd_sn);
> +		up_read(&device_offload_lock);
> +	}
>  }
>
>  static int tls_device_reencrypt(struct sock *sk, struct sk_buff *skb)
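
For illustration only, here is a userspace C sketch of one way the
resync could be guarded without a sleeping lock, so that it stays legal
in a context that must not sleep. It uses a C11 atomic flag in place of
device_offload_lock. Every name in it (model_ctx, model_dev,
model_resync, model_dev_down, model_demo) is hypothetical, none of it is
kernel API, and it is not the fix proposed in this thread; a
refcount-based fix would look different.

```c
/* Userspace model only: all names below are hypothetical, not kernel APIs. */
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_dev {
	int resync_calls;	/* counts resync callbacks delivered */
	uint32_t last_seq;	/* sequence number of the last resync */
};

struct model_ctx {
	_Atomic(struct model_dev *) dev;	/* stands in for tls_ctx->netdev */
	atomic_bool resync_running;		/* non-sleeping guard */
};

/* RX path: never waits, so it cannot sleep. */
static bool model_resync(struct model_ctx *ctx, uint32_t seq)
{
	struct model_dev *dev;
	bool called = false;

	/* Claim the guard; if it is already held, skip instead of waiting. */
	if (atomic_exchange(&ctx->resync_running, true))
		return false;

	/* Read the pointer only after the guard is visibly held. */
	dev = atomic_load(&ctx->dev);
	if (dev) {
		dev->resync_calls++;
		dev->last_seq = seq;
		called = true;
	}

	atomic_store(&ctx->resync_running, false);
	return called;
}

/* Removal path: assumed to run where waiting is allowed. */
static void model_dev_down(struct model_ctx *ctx)
{
	/* Clear the pointer first, then wait out any resync in flight. */
	atomic_store(&ctx->dev, (struct model_dev *)NULL);
	while (atomic_load(&ctx->resync_running))
		sched_yield();
	/* From here on, no resync can still be using the old device. */
}

int model_demo(void)
{
	struct model_dev dev = { 0 };
	struct model_ctx ctx;

	atomic_init(&ctx.dev, &dev);
	atomic_init(&ctx.resync_running, false);

	assert(model_resync(&ctx, 100));	/* device present: delivered */
	model_dev_down(&ctx);
	assert(!model_resync(&ctx, 200));	/* device gone: skipped */

	return dev.resync_calls;
}
```

The ordering is what matters in this sketch: the RX side sets the flag
before loading the pointer, and the removal side clears the pointer
before checking the flag. With the default sequentially consistent
atomics, either the RX side sees NULL or the removal side sees the flag
set and waits.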