[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aC6WdO4E6ugCm42K@gondor.apana.org.au>
Date: Thu, 22 May 2025 11:13:56 +0800
From: Herbert Xu <herbert@...dor.apana.org.au>
To: Arnd Bergmann <arnd@...db.de>
Cc: Corentin Labbe <clabbe.montjoie@...il.com>,
Klaus Kudielka <klaus.kudielka@...il.com>,
Eric Biggers <ebiggers@...nel.org>, regressions@...ts.linux.dev,
linux-kernel@...r.kernel.org,
Linux Crypto Mailing List <linux-crypto@...r.kernel.org>,
"'bbrezillon@...nel.org'" <bbrezillon@...nel.org>,
EBALARD Arnaud <Arnaud.Ebalard@....gouv.fr>,
Romain Perier <romain.perier@...il.com>
Subject: Re: [PATCH] crypto: marvell/cesa - Avoid empty transfer descriptor
On Wed, May 21, 2025 at 01:36:06PM +0200, Arnd Bergmann wrote:
>
> Ok. Which SoC exactly is this on? Armada XP or Armada 385?
I have no idea :) Corentin, can you tell Arnd what you're using
for the tests?
> I see. Just a few more ideas what it could be in case it's not
> what you suspect:
It appears to be proven now. After replacing the coherent memroy
with kmalloc + dma_map_single, it actually works reliably to pass
all the crypto fuzz tests (hmac is still failing but I think that's
a different issue).
Maybe I'm misreading the code, but I thought that dma_map_single on
this machine should actually be a no-op because the device is marked
as coherent? Arnd, Christophe?
But it's clearly making a difference. My suspicion arose because
the fully linear hash digests where all data came from the user
were never failing. It was only failing when some of the data was
coming from the bounce buffer within the driver. That bounce buffer
was setup with dma_map_pool (each being 64 bytes long).
> - the SRAM gets mapped into kernel space using ioremap(), which
> on Armada 375/38x uses MT_UNCACHED rather than MT_DEVICE as
> a workaround for a possible deadlock on actual MMIO registers.
> It's possible that the SRAM should be mapped using a different
> map flag to ensure it's actually consistent. If a store is
> posted to the SRAM, it may still be in flight at the time that
> the DMA master looks at it.
AFAICS we're not touching the SRAM directly. Everything is mediated
with the tdma unit on the cesa device. So the cesa tdma unit is
copying the data to and from the SRAM with DMA.
> - I see a lot of chaining of DMA descriptors, but no dma_wmb()
> or spinlock. A dma_wmb() or stronger (wmb, dma_mb, mb)
> is probably required between writing to a coherent descriptor
> and making it visible from another one. A spinlock is
> of course needed if you have multiple CPUs adding data
> into a shared linked list (I think this one is not shared
> but haven't confirmed that).
Yes that chaining is definitely broken. However, I don't think
it's causing the corruption. We've already tried disabling the
chaining and it makes no difference whatsoever.
In any case the tip of cryptodev now disables the broken chaining
so it should no longer be an issue going forward.
Thanks,
--
Email: Herbert Xu <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Powered by blists - more mailing lists