lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 30 May 2019 10:08:22 +0200
From:   Ard Biesheuvel <ard.biesheuvel@...aro.org>
To:     Horia Geanta <horia.geanta@....com>
Cc:     Herbert Xu <herbert@...dor.apana.org.au>,
        Eric Biggers <ebiggers@...nel.org>,
        Iuliana Prodan <iuliana.prodan@....com>,
        "David S. Miller" <davem@...emloft.net>,
        Sascha Hauer <s.hauer@...gutronix.de>,
        "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        dl-linux-imx <linux-imx@....com>
Subject: Re: [PATCH] crypto: gcm - fix cacheline sharing

On Thu, 30 May 2019 at 09:46, Horia Geanta <horia.geanta@....com> wrote:
>
> On 5/30/2019 8:34 AM, Herbert Xu wrote:
> > On Wed, May 29, 2019 at 01:27:28PM -0700, Eric Biggers wrote:
> >>
> >> So what about the other places that also pass an IV located next to the data,
> >> like crypto/ccm.c and crypto/adiantum.c?  If we're actually going to make this a
> Fix for ccm is WIP.
> We were not aware of adiantum since our crypto engine does not accelerate it.
>
> >> new API requirement, then we need to add a debugging option that makes the API
> >> detect this violation so that the other places can be fixed too.
> >>
> IMO this is not a new crypto API requirement.
> crypto API and its users must follow DMA API rules, besides crypto-specific ones.
>
> In this particular case, crypto/gcm.c is both an implementation and a crypto API
> user, since it uses underneath ctr(aes) (and ghash).
> Currently generic gcm implementation is breaking DMA API, since part of the dst
> buffer (auth_tag) provided to ctr(aes) is sharing a cache line with some other
> data structure (iv).
>
> The DMA API rule is mentioned in Documentation/DMA-API.txt
>
> .. warning::
>
>         Memory coherency operates at a granularity called the cache
>         line width.  In order for memory mapped by this API to operate
>         correctly, the mapped region must begin exactly on a cache line
>         boundary and end exactly on one (to prevent two separately mapped
>         regions from sharing a single cache line).
>
>

This is overly restrictive, and not in line with reality. The whole
networking stack operates on buffers shifted by 2 bytes if
NET_IP_ALIGN is left at its default value of 2. There are numerous
examples in other places as well.

Given that kmalloc() will take the cacheline granularity into account
if necessary, the only way this issue can hit is when a single kmalloc
buffer is written to by two different masters.

> >> Also, doing a kmalloc() per requset is inefficient and very error-prone.  In
> >> fact there are at least 3 bugs here: (1) not checking the return value, (2)
> >> incorrectly using GFP_KERNEL when it may be atomic context, and (3) not always
> For (2) I assume this means checking for CRYPTO_TFM_REQ_MAY_SLEEP flag.
>
> >> freeing the memory.  Why not use cacheline-aligned memory within the request
> >> context, so that a separate kmalloc() isn't needed?
> >>
> If you check previous discussion referenced in the commit message:
> Link:
> https://lore.kernel.org/linux-crypto/20190208114459.5nixe76xmmkhur75@gondor.apana.org.au/
>
> or (probably easier) look at the full thread:
> https://patchwork.kernel.org/patch/10789697/
>
> you'll see that at some point I proposed changing crypto_gcm_req_priv_ctx struct
> as follows:
> -       u8 auth_tag[16];
> +       u8 auth_tag[16] ____cacheline_aligned;
>
> Ard suggested it would be better to kmalloc the auth_tag.
>
> I am open to changing the fix, however I don't think the problem is in the
> implementation (caam driver).
>

I remember that. But in the discussion that followed, I did ask about
accessing the memory while the buffer is mapped for DMA, and I
misunderstood the response. The scatterwalk_map_and_copy writes to the
request while it is mapped for DMA.

> >> Also, did you consider whether there's any way to make the crypto API handle
> >> this automatically, so that all the individual users don't have to?
> That would probably work, but I guess it would come up with a big overhead.
>
> I am thinking crypto API would have to check each buffer used by src, dst
> scatterlists is correctly aligned - starting and ending on cache line boundaries.
>
> >
> > You're absolutely right Eric.
> >
> > What I suggested in the old thread is non-sense.  While you can
> > force GCM to provide the right pointers you cannot force all the
> > other crypto API users to do this.
> >
> Whose problem is that crypto API users don't follow the DMA API requirements?
>
> > It would appear that Ard's latest suggestion should fix the problem
> > and is the correct approach.
> >
> I disagree.
> We shouldn't force crypto implementations to be aware of such inconsistencies in
> the I/O data buffers (pointed to by src/dst scatterlists) that are supposed to
> be safely DMA mapped.
>

I'm on the fence here. On the one hand, it is slightly dodgy for the
GCM driver to pass a scatterlist referencing a buffer that shares a
cacheline with another buffer passed by an ordinary pointer, and for
which an explicit requirement exists that the callee should update it
before returning.

On the other hand, I think it is reasonable to require drivers not to
perform such updates while the scatterlist is mapped for DMA, since
fixing it in the callers puts a disproportionate burden on them, given
that non-coherent DMA only represents a small minority of use cases.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ