Message-ID: <20250209190525.GA6017@sol.localdomain>
Date: Sun, 9 Feb 2025 11:05:25 -0800
From: Eric Biggers <ebiggers@...nel.org>
To: David Howells <dhowells@...hat.com>
Cc: netdev@...r.kernel.org, Herbert Xu <herbert@...dor.apana.org.au>,
Marc Dionne <marc.dionne@...istor.com>,
Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Trond Myklebust <trond.myklebust@...merspace.com>,
Chuck Lever <chuck.lever@...cle.com>,
Ard Biesheuvel <ardb@...nel.org>,
"Cabiddu, Giovanni" <giovanni.cabiddu@...el.com>,
qat-linux <qat-linux@...el.com>, linux-crypto@...r.kernel.org,
linux-afs@...ts.infradead.org, linux-nfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 03/24] crypto: Add 'krb5enc' hash and cipher AEAD algorithm

On Sun, Feb 09, 2025 at 06:37:27PM +0000, David Howells wrote:
> One of the issues I have with doing it on the CPU is that you have to do two
> operations and, currently, they're done synchronously and serially.
>
> Can you implement "auth5enc(hmac(sha256),cts(cbc(aes)))" in assembly and
> actually make the assembly do both the AES and SHA at the same time? It looks
> like it *might* be possible - but that you might be an XMM register short of
> being able to do it:-/

Yes, that would be the proper way to optimize that algorithm. Someone just
needs to do it. (And presumably you want this one and not Camellia which you
are also pushing for some reason?)

> > I don't see why off-CPU hardware offload support should deserve much
> > attention here, given the extremely high speed of on-CPU crypto these days
> > and the great difficulty of integrating off-CPU acceleration efficiently.
> > In particular it seems weird to consider Intel QAT a reasonable thing to use
> > over VAES.
>
> Because some modern CPUs come with on-die crypto offload - and that can do
> hash+encrypt or encrypt+hash in parallel. Now, there are a couple of issues
> with using the QAT here:
>
> (1) It doesn't support CTS. This means we'd have to impose the CTS from
> above - and that may well make it unusable in doing hash + encrypt
> simultaneously.
>
> (2) It really needs batching to make it cheap enough to use. This might
> actually be less of a problem - at least for rxgk. The data is split up
> into fixed-size packets, but for a large amount of data we can end up
> filling packets faster than we can transmit them. This offers the
> opportunity to batch them - up to ~8192 packets in a single batch.
>
> For NFS, things are a bit different. Because that mostly uses a streaming
> transport these days, it wants to prepare a single huge message in one go -
> and being able to parallelise the encrypt and the hash could be a benefit.

Right, the batching is always a huge issue for those types of accelerators. A
much more promising approach is to just fully take advantage of the CPU
instructions that already accelerate the same algorithms very well.

- Eric