lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 8 Aug 2019 11:59:18 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>,
        John Fastabend <john.fastabend@...il.com>
Cc:     David Miller <davem@...emloft.net>,
        Network Development <netdev@...r.kernel.org>,
        davejwatson@...com, borisp@...lanox.com, aviadye@...lanox.com,
        Daniel Borkmann <daniel@...earbox.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        oss-drivers@...ronome.com
Subject: Re: [PATCH net v3] net/tls: prevent skb_orphan() from leaking TLS
 plain text with offload

On Wed, Aug 7, 2019 at 8:04 PM Jakub Kicinski
<jakub.kicinski@...ronome.com> wrote:
>
> sk_validate_xmit_skb() and drivers depend on the sk member of
> struct sk_buff to identify segments requiring encryption.
> Any operation which removes or does not preserve the original TLS
> socket such as skb_orphan() or skb_clone() will cause clear text
> leaks.
>
> Make the TCP socket underlying an offloaded TLS connection
> mark all skbs as decrypted, if TLS TX is in offload mode.
> Then in sk_validate_xmit_skb() catch skbs which have no socket
> (or a socket with no validation) and decrypted flag set.
>
> Note that CONFIG_SOCK_VALIDATE_XMIT, CONFIG_TLS_DEVICE and
> sk->sk_validate_xmit_skb are slightly interchangeable right now,
> they all imply TLS offload. The new checks are guarded by
> CONFIG_TLS_DEVICE because that's the option guarding the
> sk_buff->decrypted member.
>
> Second, smaller issue with orphaning is that it breaks
> the guarantee that packets will be delivered to device
> queues in-order. All TLS offload drivers depend on that
> scheduling property. This means skb_orphan_partial()'s
> trick of preserving partial socket references will cause
> issues in the drivers. We need a full orphan, and as a
> result netem delay/throttling will cause all TLS offload
> skbs to be dropped.
>
> Reusing the sk_buff->decrypted flag also protects from
> leaking clear text when incoming, decrypted skb is redirected
> (e.g. by TC).
>
> See commit 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect
> through ULP") for justification why the internal flag is safe.
> The only location which could leak the flag in is tcp_bpf_sendmsg(),
> which is taken care of by clearing the previously unused bit.
>
> v2:
>  - remove superfluous decrypted mark copy (Willem);
>  - remove the stale doc entry (Boris);
>  - rely entirely on EOR marking to prevent coalescing (Boris);
>  - use an internal sendpages flag instead of marking the socket
>    (Boris).
> v3 (Willem):
>  - reorganize the can_skb_orphan_partial() condition;
>  - fix the flag leak-in through tcp_bpf_sendmsg.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
> ---


> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index 3d1e15401384..8a56e09cfb0e 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -398,10 +398,14 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>  static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>  {
>         struct sk_msg tmp, *msg_tx = NULL;
> -       int flags = msg->msg_flags | MSG_NO_SHARED_FRAGS;
>         int copied = 0, err = 0;
>         struct sk_psock *psock;
>         long timeo;
> +       int flags;
> +
> +       /* Don't let internal do_tcp_sendpages() flags through */
> +       flags = (msg->msg_flags & ~MSG_SENDPAGE_DECRYPTED);
> +       flags |= MSG_NO_SHARED_FRAGS;

Not for this patch, but for tcp_bpf itself: should this more
aggressively filter flags?

Both those that are valid in sendmsg, but not expected in sendpage,
and other internal flags passed to sendpage, but should never be
passable from userspace.

> diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
> index 7c0b2b778703..43922d86e510 100644
> --- a/net/tls/tls_device.c
> +++ b/net/tls/tls_device.c
> @@ -373,9 +373,9 @@ static int tls_push_data(struct sock *sk,
>         struct tls_context *tls_ctx = tls_get_ctx(sk);
>         struct tls_prot_info *prot = &tls_ctx->prot_info;
>         struct tls_offload_context_tx *ctx = tls_offload_ctx_tx(tls_ctx);
> -       int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
>         int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE);
>         struct tls_record_info *record = ctx->open_record;
> +       int tls_push_record_flags;
>         struct page_frag *pfrag;
>         size_t orig_size = size;
>         u32 max_open_record_len;
> @@ -390,6 +390,9 @@ static int tls_push_data(struct sock *sk,
>         if (sk->sk_err)
>                 return -sk->sk_err;
>
> +       flags |= MSG_SENDPAGE_DECRYPTED;
> +       tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
> +

Without being too familiar with this code: can this plaintext flag be
set once, closer to the call to do_tcp_sendpages, in tls_push_sg?

Instead of two locations with multiple non-trivial codepaths between
them and do_tcp_sendpages.

Or are there paths where the flag is not set? Which I imagine would
imply already passing s/w encrypted ciphertext.

>         timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>         if (tls_is_partially_sent_record(tls_ctx)) {
>                 rc = tls_push_partial_record(sk, tls_ctx, flags);
> @@ -576,7 +579,9 @@ void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
>                 gfp_t sk_allocation = sk->sk_allocation;
>
>                 sk->sk_allocation = GFP_ATOMIC;
> -               tls_push_partial_record(sk, ctx, MSG_DONTWAIT | MSG_NOSIGNAL);
> +               tls_push_partial_record(sk, ctx,
> +                                       MSG_DONTWAIT | MSG_NOSIGNAL |
> +                                       MSG_SENDPAGE_DECRYPTED);
>                 sk->sk_allocation = sk_allocation;
>         }
>  }
> --
> 2.21.0
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ