lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 30 May 2019 07:46:29 +0000
From:   Horia Geanta <>
To:     Herbert Xu <>,
        Eric Biggers <>
CC:     Iuliana Prodan <>,
        "David S. Miller" <>,
        Ard Biesheuvel <>,
        Sascha Hauer <>,
        "" <>,
        "" <>,
        dl-linux-imx <>
Subject: Re: [PATCH] crypto: gcm - fix cacheline sharing

On 5/30/2019 8:34 AM, Herbert Xu wrote:
> On Wed, May 29, 2019 at 01:27:28PM -0700, Eric Biggers wrote:
>> So what about the other places that also pass an IV located next to the data,
>> like crypto/ccm.c and crypto/adiantum.c?  If we're actually going to make this a
Fix for ccm is WIP.
We were not aware of adiantum since our crypto engine does not accelerate it.

>> new API requirement, then we need to add a debugging option that makes the API
>> detect this violation so that the other places can be fixed too.
IMO this is not a new crypto API requirement.
crypto API and its users must follow DMA API rules, besides crypto-specific ones.

In this particular case, crypto/gcm.c is both an implementation and a crypto API
user, since it uses underneath ctr(aes) (and ghash).
Currently generic gcm implementation is breaking DMA API, since part of the dst
buffer (auth_tag) provided to ctr(aes) is sharing a cache line with some other
data structure (iv).

The DMA API rule is mentioned in Documentation/DMA-API.txt

.. warning::

        Memory coherency operates at a granularity called the cache
        line width.  In order for memory mapped by this API to operate
        correctly, the mapped region must begin exactly on a cache line
        boundary and end exactly on one (to prevent two separately mapped
        regions from sharing a single cache line).

>> Also, doing a kmalloc() per requset is inefficient and very error-prone.  In
>> fact there are at least 3 bugs here: (1) not checking the return value, (2)
>> incorrectly using GFP_KERNEL when it may be atomic context, and (3) not always
For (2) I assume this means checking for CRYPTO_TFM_REQ_MAY_SLEEP flag.

>> freeing the memory.  Why not use cacheline-aligned memory within the request
>> context, so that a separate kmalloc() isn't needed?
If you check previous discussion referenced in the commit message:

or (probably easier) look at the full thread:

you'll see that at some point I proposed changing crypto_gcm_req_priv_ctx struct
as follows:
-       u8 auth_tag[16];
+       u8 auth_tag[16] ____cacheline_aligned;

Ard suggested it would be better to kmalloc the auth_tag.

I am open to changing the fix, however I don't think the problem is in the
implementation (caam driver).

>> Also, did you consider whether there's any way to make the crypto API handle
>> this automatically, so that all the individual users don't have to?
That would probably work, but I guess it would come up with a big overhead.

I am thinking crypto API would have to check each buffer used by src, dst
scatterlists is correctly aligned - starting and ending on cache line boundaries.

> You're absolutely right Eric.
> What I suggested in the old thread is non-sense.  While you can
> force GCM to provide the right pointers you cannot force all the
> other crypto API users to do this.
Whose problem is that crypto API users don't follow the DMA API requirements?

> It would appear that Ard's latest suggestion should fix the problem
> and is the correct approach.
I disagree.
We shouldn't force crypto implementations to be aware of such inconsistencies in
the I/O data buffers (pointed to by src/dst scatterlists) that are supposed to
be safely DMA mapped.


Powered by blists - more mailing lists