Date:   Thu, 21 Apr 2022 15:47:30 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     Arnd Bergmann <arnd@...db.de>,
        Christoph Hellwig <hch@...radead.org>,
        Ard Biesheuvel <ardb@...nel.org>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Will Deacon <will@...nel.org>, Marc Zyngier <maz@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>
Subject: Re: [PATCH 07/10] crypto: Use ARCH_DMA_MINALIGN instead of ARCH_KMALLOC_MINALIGN

On Thu, Apr 21, 2022 at 3:25 PM Catalin Marinas <catalin.marinas@....com> wrote:
> On Thu, Apr 21, 2022 at 02:28:45PM +0200, Arnd Bergmann wrote:
> > We also know that larger slabs are all cacheline aligned, so simply
> > comparing the transfer size is enough to rule out most, in this case
> > any transfer larger than 96 bytes must come from the kmalloc-128
> > or larger cache, so that works like before.
>
> There's also the case with 128-byte cache lines and kmalloc-192.

Sure, but that's much less common, as the few machines with 128 byte
cache lines tend to also have cache coherent devices IIRC, so we'd
skip the bounce buffer entirely.

> > For transfers <=96 bytes, the possibilities are:
> >
> > 1. kmalloc-32 or smaller, always needs to bounce
> > 2. kmalloc-96, but at least one byte in partial cache line,
> >     need to bounce
> > 3. kmalloc-64, may skip the bounce.
> > 4. kmalloc-128 or larger, or not a slab cache but a partial
> >     transfer, may skip the bounce.
> >
> > I would guess that the first case is the most common here,
> > so unless bouncing one or two cache lines is extremely
> > expensive, I don't expect it to be worth optimizing for the latter
> > two cases.
>
> I think so. If someone complains of a performance regression, we can
> look at optimising the bounce. I have a suspicion the cost of copying
> two cache lines is small compared to swiotlb_find_slots() etc.

That is possible, and we'd definitely have to watch out for
performance regressions. I'm just skeptical that the cases that
suffer from the extra bounce buffering on 33..64 byte allocations
would benefit much from a special case if the 1..32 and 65..96
byte allocations are still slow.

Another, simpler way to do this might be to just not create the
kmalloc-96 (or kmalloc-192) caches, and assume that any
transfer >=33 (or 65) bytes is safe.
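
To make the size thresholds above concrete, here is a minimal userspace
sketch of the "judge by transfer length alone" check. It is not kernel
code: the helper name size_needs_bounce(), the kmalloc_sizes[] table and
the skip_odd flag are all hypothetical, and the model assumes that
anything larger than the biggest listed cache is cache line aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative cache sizes only; the real set depends on the kernel config. */
static const size_t kmalloc_sizes[] = {
	8, 16, 32, 64, 96, 128, 192, 256, 512, 1024
};
#define NR_SIZES (sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]))

/*
 * Hypothetical helper: returns true if, judging by length alone, a
 * transfer of len bytes could live in a cache whose object size is not
 * a multiple of the cache line size, so it would have to be bounced.
 *
 * line:     cache line size in bytes (64 or 128 in this discussion)
 * skip_odd: model a kernel that does not create kmalloc-96/kmalloc-192
 */
static bool size_needs_bounce(size_t len, size_t line, bool skip_odd)
{
	assert(line != 0);

	for (size_t i = 0; i < NR_SIZES; i++) {
		size_t sz = kmalloc_sizes[i];

		if (skip_odd && (sz == 96 || sz == 192))
			continue;	/* cache does not exist in this model */
		if (sz < len)
			continue;	/* object too small to hold the transfer */
		if (sz % line != 0)
			return true;	/* may share a cache line with a neighbour */
	}

	/* Every cache that could hold len bytes is line aligned in this model. */
	return false;
}
```

With 64-byte lines and all caches present, this model says bounce for
anything up to 96 bytes and skip from 97 bytes up. With kmalloc-96 and
kmalloc-192 dropped, the cutoff moves to 33 bytes for 64-byte lines and
65 bytes for 128-byte lines. With 128-byte lines and kmalloc-192 present,
transfers up to 192 bytes still have to bounce.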

       Arnd
