netdev - Re: [RFC 1/1] net/tls(TLS_SW): Handle -ENOSPC error return from device/AES-NI

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201007134746.069d7f2f@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Wed, 7 Oct 2020 13:47:46 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Pooja Trivedi <poojatrivedi@...il.com>
Cc:     linux-crypto@...r.kernel.org,
        Mallesham Jatharakonda <mallesh537@...il.com>,
        Josh Tway <josh.tway@...ckpath.com>, netdev@...r.kernel.org
Subject: Re: [RFC 1/1] net/tls(TLS_SW): Handle -ENOSPC error return from
 device/AES-NI

On Wed, 7 Oct 2020 15:19:47 -0400 Pooja Trivedi wrote:
> When an -ENOSPC error code is returned by the crypto device or AES-NI
> layer, TLS SW path sets an EBADMSG on the socket causing the
> application to fail.  In an attempt to address the -ENOSPC in the TLS
> SW path, changes were made in tls_sw_sendpage path to trim current
> payload off the plain and encrypted messages, and send a 'zero bytes
> copied' return to the application.  The following diff shows those
> changes:
> 
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 9a3d9fedd7aa..4dce4668cb07 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -762,7 +762,7 @@ static int tls_push_record(struct sock *sk, int flags,
>   rc = tls_do_encryption(sk, tls_ctx, ctx, req,
>         msg_pl->sg.size + prot->tail_size, i);
>   if (rc < 0) {
> -               if (rc != -EINPROGRESS) {
> +              if ((rc != -EINPROGRESS) && (rc != -ENOSPC)) {
>                             tls_err_abort(sk, EBADMSG);
>                             if (split) {
> 
> tls_ctx->pending_open_record_frags = true;
> 
> @@ -1073,8 +1073,15 @@ int tls_sw_sendmsg(struct sock *sk, struct
> msghdr *msg, size_t size)
>   else if (ret == -ENOMEM)
>                goto wait_for_memory;
>   else if (ret != -EAGAIN) {
> -              if (ret == -ENOSPC)
> +             if (ret == -ENOSPC) {
>                           ret = 0;
>                           copied -= try_to_copy;
>                           iov_iter_revert(&msg->msg_iter,
> msg_pl->sg.size - orig_size);
>                           tls_trim_both_msgs(sk, orig_size);
>                }
>                goto send_end;
>      }
>   }
> 
> @@ -1150,6 +1157,7 @@ static int tls_sw_do_sendpage(struct sock *sk,
> struct page *page,
>   ssize_t copied = 0;
>   bool full_record;
>   int record_room;
> + int orig_size;
>   int ret = 0;
>   bool eor;
> 
> @@ -1175,7 +1183,7 @@ static int tls_sw_do_sendpage(struct sock *sk,
> struct page *page,
>   }
> 
>   msg_pl = &rec->msg_plaintext;
> -
> + orig_size = msg_pl->sg.size;
>   full_record = false;
>   record_room = TLS_MAX_PAYLOAD_SIZE - msg_pl->sg.size;
>   copy = size;
> 
> @@ -1219,8 +1227,12 @@ static int tls_sw_do_sendpage(struct sock *sk,
> struct page *page,
> 
>   else if (ret == -ENOMEM)
>              goto wait_for_memory;
>   else if (ret != -EAGAIN) {
> -            if (ret == -ENOSPC)
> +           if (ret == -ENOSPC) {
> +                      tls_trim_both_msgs(sk, orig_size);
>                         copied -= copy;
>                         ret = 0;
>               }
>               goto sendpage_end;
>    }
>   }
> 
> 
> However, when above patch was tried, the application tried to send
> remaining data based on the offset as expected, but encryption failed
> due to data integrity issues.  Further debugging revealed that
> sk_msg_trim(), called by tls_trim_both_msgs() does not update the page
> frag offset.  It seems like sk_msg_trim() needs to subtract the trim
> length from the pfrag->offset corresponding to how the sk_msg_alloc()
> call increments the pfrag-->offset with the length it charges to the
> socket via sk_mem_charge().
> When sk_msg_trim() calls sk_mem_uncharge() to uncharge trim length off
> the socket, it should also perform
>            pfrag->offset -= trim;

Let's CC netdev. It's a bit surprising to me that pfrag->offset matters
here, we're basically "wasting" a bit of the page but why would that
cause data integrity issues?

> While the sk_msg_trim() pfrag->offset subtraction change hasn't been
> tried yet, the following hack in TLS layer has been tried and has
> correctly worked. This proves that the above observation/theory
> regarding pfrag->offset update would be the fix:
> 
> 
> +                                       if (ret == -ENOSPC) {
> +                                               struct page_frag *pfrag;
> +                                               tls_trim_both_msgs(sk,
> orig_size);
> +
> +                                               copied -= copy;
> +                                               pfrag = sk_page_frag(sk);
> +                                               pfrag->offset -= copy;
> 
> 
> What are your thoughts on the best way to fix the issue?
> sk_msg_trim() seems like the most logical place, but
> suggestions/comments/questions are welcome.
> 
> Another thing to think about is whether -EBUSY should be handled
> similarly. Vendors have differences and the conditions under which
> these error codes are returned are not very consistent when the sidecar
> device path is involved.

Why would the driver return EBUSY from an async API? What's the caller
supposed to do with that?


While we have you - weren't you sending a sendpage() fix at some point?
Did that get lost?