Open Source and information security mailing list archives
 
Date:   Thu, 21 Apr 2022 14:28:45 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     Christoph Hellwig <hch@...radead.org>,
        Arnd Bergmann <arnd@...db.de>,
        Ard Biesheuvel <ardb@...nel.org>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Will Deacon <will@...nel.org>, Marc Zyngier <maz@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN

On Thu, Apr 21, 2022 at 1:06 PM Catalin Marinas <catalin.marinas@....com> wrote:
>
> On Thu, Apr 21, 2022 at 12:20:22AM -0700, Christoph Hellwig wrote:
> > Btw, there is another option:  Most real systems already require having
> > swiotlb to bounce buffer in some cases.  We could simply force bounce
> > buffering in the dma mapping code for too small or not properly aligned
> > transfers and just decrease the dma alignment.
>
> We can force bounce if size is small but checking the alignment is
> trickier. Normally the beginning of the buffer is aligned but the end is
> at some sizeof() distance. We need to know whether the end is in a
> kmalloc-128 cache and that requires reaching out to the slab internals.
> That's doable and not expensive but it needs to be done for every small
> size getting to the DMA API, something like (for mm/slub.c):
>
>         folio = virt_to_folio(x);
>         slab = folio_slab(folio);
>         if (slab->slab_cache->align < ARCH_DMA_MINALIGN)
>                 ... bounce ...
>
> (and a bit different for mm/slab.c)

I think the decision to bounce or not can be based on the actual
cache line size at runtime, so most commonly 64 bytes on arm64,
even though the compile-time limit is 128 bytes.

We also know that larger slabs are all cacheline aligned, so simply
comparing the transfer size is enough to rule out most cases: any
transfer larger than 96 bytes must come from the kmalloc-128 or a
larger cache, so that works like before.

For transfers <=96 bytes, the possibilities are:

1. kmalloc-32 or smaller, always needs to bounce.
2. kmalloc-96, but at least one byte is in a partial cache line,
    needs to bounce.
3. kmalloc-64, may skip the bounce.
4. kmalloc-128 or larger, or not a slab cache but a partial
    transfer, may skip the bounce.

I would guess that the first case is the most common here,
so unless bouncing one or two cache lines is extremely
expensive, I don't expect it to be worth optimizing for the latter
two cases.
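
To make the cases above concrete, here is a rough user-space sketch of
the two checks. It is not kernel code: the function and macro names are
hypothetical, the 64-byte cache line is the common arm64 value assumed
above, and passing the slab object size as a plain argument (0 meaning
"not a slab cache") stands in for whatever lookup the real DMA mapping
code would need.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical constants for illustration; not kernel definitions. */
#define SKETCH_CACHE_LINE     64u /* assumed runtime cache line size */
#define SKETCH_SIZE_THRESHOLD 96u /* largest kmalloc cache that is not cache-line aligned */

/*
 * Conservative check based on transfer size only: anything larger than
 * 96 bytes must come from kmalloc-128 or a larger cache, so only small
 * transfers are bounced.
 */
static bool sketch_needs_bounce_by_size(size_t transfer_size)
{
	return transfer_size <= SKETCH_SIZE_THRESHOLD;
}

/*
 * Finer check for when the slab object size is known (cases 1 to 4).
 * slab_object_size == 0 is this sketch's marker for "not a slab cache".
 */
static bool sketch_needs_bounce_by_cache(size_t transfer_size,
					 size_t slab_object_size)
{
	if (transfer_size > SKETCH_SIZE_THRESHOLD)
		return false;	/* kmalloc-128 or larger */
	if (slab_object_size == 0)
		return false;	/* case 4: partial transfer, not from a slab */
	/* Cases 1 and 2 bounce; cases 3 and 4 (multiples of 64) may skip. */
	return slab_object_size % SKETCH_CACHE_LINE != 0;
}
```

The size-only variant is what I would start with; the second function
only shows what optimizing for cases 3 and 4 would additionally require.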

       Arnd
