Message-ID: <aEwHPjxw4K6t5mgf@gcabiddu-mobl.ger.corp.intel.com>
Date: Fri, 13 Jun 2025 12:10:54 +0100
From: Giovanni Cabiddu <giovanni.cabiddu@...el.com>
To: Eric Biggers <ebiggers@...nel.org>
CC: Simon Richter <Simon.Richter@...yros.de>, <linux-fscrypt@...r.kernel.org>,
	<linux-crypto@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<linux-mtd@...ts.infradead.org>, <linux-ext4@...r.kernel.org>,
	<linux-f2fs-devel@...ts.sourceforge.net>, <ceph-devel@...r.kernel.org>
Subject: Re: [PATCH] fscrypt: don't use hardware offload Crypto API drivers

On Fri, Jun 13, 2025 at 01:23:57AM +0000, Eric Biggers wrote:
> On Thu, Jun 12, 2025 at 03:57:43PM +0000, Eric Biggers wrote:
> > On Thu, Jun 12, 2025 at 09:50:26AM +0100, Giovanni Cabiddu wrote:
> > > On Wed, Jun 11, 2025 at 11:25:21PM -0700, Eric Biggers wrote:
> > > 
> > > ...
> > > 
> > > > FWIW, here's what happens if you try to use the Intel QAT driver with dm-crypt:
> > > > https://lore.kernel.org/r/CACsaVZ+mt3CfdXV0_yJh7d50tRcGcRZ12j3n6-hoX2cz3+njsg@mail.gmail.com/
> > > 
> > > /s/happens/happened/
> > > 
> > > ... and it got fixed
> > > https://lore.kernel.org/all/20220506082327.21605-1-giovanni.cabiddu@intel.com/
> > 
> > But it reached users in the first place, including stable kernels.  And
> > apparently the issues were going on for years and were known to the authors of
> > the driver
> > (https://lore.kernel.org/linux-crypto/91fe9f87-54d7-4140-4d1a-eac8e2081a7c@gmail.com/).
> > 
> > We simply don't have issues like this with the AES-NI or VAES XTS code.
> > 
> > And separately, QAT was reported to be much slower than AES-NI for synchronous use
> > (https://lore.kernel.org/linux-crypto/0171515-7267-624-5a22-238af829698f@redhat.com/)
> > 
> > Later, I added VAES accelerated AES-XTS code which is over twice as fast as
> > AES-NI on the latest Intel CPUs, so that likely widened the gap even more.
> > 
> > Yet, the QAT driver registers its "xts(aes)" implementation with priority 4001,
> > compared to priority 800 for the VAES accelerated one.  So the QAT one is the
> > one that will be used by fscrypt!
> > 
> > That seems like a major issue even just from a performance perspective.
> > 
> > I expect this patch will significantly improve fscrypt performance on Intel
> > servers that have QAT.
> 
> I was curious, so I actually ran a benchmark on an Intel Emerald Rapids server.
> Specifically, I used a kernel module that repeatedly en/decrypted 4096-byte
> messages with AES-XTS using crypto_skcipher_en/decrypt().  That's basically what
> fscrypt's file contents encryption does, but here I just measured the raw crypto
> performance.  I tested both xts-aes-vaes-avx512 and qat_aes_xts.  For both, the
> difference between encryption and decryption was within the margin of error, so
> I'll give just one number for each.
> 
> Results:
> 
>     xts-aes-vaes-avx512: 16171 MB/s
>     qat_aes_xts: 289 MB/s
> 
> So, QAT is 55 times slower than the VAES-optimized software code!
> 
> It's even slower than the generic C code:
>      
>     xts(ecb(aes-generic)): 305 MB/s
> 
> Now, it could be argued that this is user error -- I "should" have created lots
> of asynchronous crypto requests for 4K blocks, submitted them all at once, and
> waited for them to complete.  Thus allowing parallel processing by QAT.
> 
> But, that's simply not what fscrypt does.  And even if it did, it could only
> plausibly help for large bios.  Short bios, for which latency is really
> important, would continue to be massively regressed by using QAT for them.
> 
> Even for large bios, it would have to get over 55 times faster to be worth it,
> which seems (very?) tenuous.
> 
> Also, as is known from dm-crypt which does do async processing, the code that's
> needed to do it is quite complex and error-prone.
> 
> In any case, async processing would be a theoretical future improvement.  It's
> simply not what fscrypt does today, or has ever done.
> 
> I also found that, even though I built the QAT driver as a loadable module, it
> was loaded automatically on the system and prioritized itself over the VAES-
> accelerated AES-XTS.  Thus, it would be what fscrypt uses on Intel servers where
> the QAT driver is enabled in kconfig, even just as 'm'.

I just sent a patch to lower the priority of the skcipher (and aead)
algorithms in the QAT driver. This should allow xts-aes-vaes-avx512 to be
selected by default.
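
To illustrate why the priority value decides which implementation gets
used, here is a minimal userspace sketch of "highest priority wins"
selection. It is not the kernel's lookup code: struct impl and pick_impl()
are hypothetical names made up for this example. The priorities 4001 and
800 are the ones quoted above; the lowered value of 100 in after_patch is
an assumption for illustration only, not a value taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a registered implementation.
 * Hypothetical type, not the kernel's struct crypto_alg. */
struct impl {
    const char *alg_name;    /* generic algorithm name, e.g. "xts(aes)" */
    const char *driver_name; /* name of the specific implementation */
    int priority;            /* higher value is preferred */
};

/* Return the highest-priority implementation of alg_name,
 * or NULL if no entry in the table provides that algorithm. */
static const struct impl *pick_impl(const struct impl *tbl, size_t n,
                                    const char *alg_name)
{
    const struct impl *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(tbl[i].alg_name, alg_name) != 0)
            continue;
        if (best == NULL || tbl[i].priority > best->priority)
            best = &tbl[i];
    }
    return best;
}

/* Priorities as quoted in this thread. */
static const struct impl before_patch[] = {
    { "xts(aes)", "xts-aes-vaes-avx512", 800 },
    { "xts(aes)", "qat_aes_xts", 4001 },
};

/* ASSUMPTION: 100 is an illustrative lowered priority only. */
static const struct impl after_patch[] = {
    { "xts(aes)", "xts-aes-vaes-avx512", 800 },
    { "xts(aes)", "qat_aes_xts", 100 },
};
```

With before_patch, pick_impl(before_patch, 2, "xts(aes)") returns the
qat_aes_xts entry; with after_patch it returns xts-aes-vaes-avx512. Any
lowered value below 800 would give the same result in this sketch.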

As for the module loading behaviour: loadable modules are automatically
loaded at startup if hardware that matches the device IDs they support
is found.
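
As a rough sketch of that matching step: a driver publishes a table of the
device IDs it supports, and a module is a load candidate when a device
found on the system matches an entry. The code below is a userspace
illustration only, not the kernel or udev implementation. struct dev_id
and module_matches() are hypothetical names, and the device ID 0x1234 is a
placeholder, not a real QAT device ID. Only the Intel PCI vendor ID 0x8086
is a real value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device ID pair (vendor, device). */
struct dev_id {
    uint16_t vendor;
    uint16_t device;
};

#define VENDOR_INTEL          0x8086 /* real PCI vendor ID for Intel */
#define PLACEHOLDER_DEVICE_ID 0x1234 /* ASSUMPTION: made-up device ID */

/* IDs that the example "module" claims to support. */
static const struct dev_id supported_ids[] = {
    { VENDOR_INTEL, PLACEHOLDER_DEVICE_ID },
};

/* Return 1 if (vendor, device) appears in the table, else 0.
 * A match is what makes the module eligible for automatic loading. */
static int module_matches(const struct dev_id *tbl, size_t n,
                          uint16_t vendor, uint16_t device)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].vendor == vendor && tbl[i].device == device)
            return 1;
    }
    return 0;
}
```

In this sketch, module_matches(supported_ids, 1, 0x8086, 0x1234) returns 1,
while a device with any other ID returns 0 and would not trigger a load.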

Regards,

-- 
Giovanni
