Message-ID: <1934772.1739126247@warthog.procyon.org.uk>
Date: Sun, 09 Feb 2025 18:37:27 +0000
From: David Howells <dhowells@...hat.com>
To: Eric Biggers <ebiggers@...nel.org>
Cc: dhowells@...hat.com, netdev@...r.kernel.org,
Herbert Xu <herbert@...dor.apana.org.au>,
Marc Dionne <marc.dionne@...istor.com>,
Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Trond Myklebust <trond.myklebust@...merspace.com>,
Chuck Lever <chuck.lever@...cle.com>,
Ard Biesheuvel <ardb@...nel.org>,
"Cabiddu, Giovanni" <giovanni.cabiddu@...el.com>,
qat-linux <qat-linux@...el.com>, linux-crypto@...r.kernel.org,
linux-afs@...ts.infradead.org, linux-nfs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 03/24] crypto: Add 'krb5enc' hash and cipher AEAD algorithm
Eric Biggers <ebiggers@...nel.org> wrote:
> Linux's "cts" is specifically the CS3 variant of CTS (using the terminology
> of NIST SP800-38A https://dl.acm.org/doi/pdf/10.5555/2206248) which
> unconditionally swaps the last two blocks. Is that the variant that is
> needed here?
It's what NFS/SunRPC does and what works with the AuriStor YFS/AFS RxGK
implementation, so I presume so.
> SP800-38A mentions that CS3 is the variant used in Kerberos 5,
> so I assume yes. If yes, then you need to use cts(cbc(aes))
> unconditionally. (BTW, I hope you have some test that shows that you
> actually implemented the Kerberos protocol correctly?)
Depends what you mean by "the Kerberos protocol", I suppose. I took the
Kerberos implementation from net/sunrpc/ and genericised it a bit so that I
could also use it for net/rxrpc/ and added AES+SHA2 and Camellia. It doesn't
use the Kerberos communications protocol per se, just the encryption formats.
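As a rough illustration of the CS3 arrangement discussed above, the following
Python sketch applies ciphertext stealing on top of CBC and always emits the
final two ciphertext blocks in swapped order. It is only a sketch under stated
assumptions: the block cipher is a toy Feistel construction standing in for
AES, and every function name here is hypothetical. None of it is taken from
the kernel's "cts" template or from the krb5 library.

```python
import hashlib

BLOCK = 16  # block size in bytes, chosen to match AES


def xor(a, b):
    """XOR two byte strings (truncates to the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key, i, half):
    # Round function for the toy Feistel cipher (assumption: NOT AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key, block):
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _f(key, i, right))
    return left + right


def toy_decrypt_block(key, block):
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _f(key, i, left)), left
    return left + right


def cbc_encrypt_blocks(key, iv, data):
    """Plain CBC over block-aligned data; returns a list of blocks."""
    prev, out = iv, []
    for i in range(0, len(data), BLOCK):
        prev = toy_encrypt_block(key, xor(data[i:i + BLOCK], prev))
        out.append(prev)
    return out


def cs3_encrypt(key, iv, pt):
    if len(pt) <= BLOCK:
        raise ValueError("this sketch needs more than one block of input")
    d = len(pt) % BLOCK or BLOCK          # bytes in the final block
    blocks = cbc_encrypt_blocks(key, iv, pt + bytes(BLOCK - d))
    # CS3: last block first, then the truncated penultimate block,
    # even when the input is block-aligned (d == BLOCK).
    return b"".join(blocks[:-2]) + blocks[-1] + blocks[-2][:d]


def cs3_decrypt(key, iv, ct):
    if len(ct) <= BLOCK:
        raise ValueError("this sketch needs more than one block of input")
    d = len(ct) % BLOCK or BLOCK
    head = ct[:len(ct) - BLOCK - d]
    c_last = ct[len(head):len(head) + BLOCK]
    c_pen_part = ct[len(head) + BLOCK:]
    x = toy_decrypt_block(key, c_last)
    c_pen = c_pen_part + x[d:]            # recover the "stolen" tail
    p_last = xor(x[:d], c_pen_part)
    prev = head[-BLOCK:] if head else iv
    p_pen = xor(toy_decrypt_block(key, c_pen), prev)
    out, prev = [], iv
    for i in range(0, len(head), BLOCK):  # ordinary CBC for the rest
        c = head[i:i + BLOCK]
        out.append(xor(toy_decrypt_block(key, c), prev))
        prev = c
    return b"".join(out) + p_pen + p_last
```

The ciphertext has the same length as the plaintext, and for a block-aligned
input the output is simply the CBC ciphertext with its last two blocks
exchanged, which is the "unconditional swap" that distinguishes CS3.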
To test this, I added test vectors to crypto/testmgr.h and gave the krb5
lib its own selftests, since those can do more comprehensive testing than the
testmgr. Note that I didn't find test vectors for AES+SHA1 that I could use,
so I haven't added those. I could generate some by printing samples
generated by my code - but that's kind of circular :-/
On top of that, I've tested the code by running xfstests, git checkouts and
kernel builds against an AuriStor YFS server with an RxGK key - so it at least
agrees with that server's expectations.
> x86_64 already has an AES-NI assembly optimized cts(cbc(aes)), as you
> mentioned. I will probably add a VAES optimized cts(cbc(aes)) at some
> point; I've just been doing other modes first.
One of the issues I have with doing it on the CPU is that you have to do two
operations and, currently, they're done synchronously and serially.
Can you implement "auth5enc(hmac(sha256),cts(cbc(aes)))" in assembly and
actually make the assembly do both the AES and SHA at the same time? It looks
like it *might* be possible - but that you might be an XMM register short of
being able to do it:-/
> I don't see why off-CPU hardware offload support should deserve much
> attention here, given the extremely high speed of on-CPU crypto these days
> and the great difficulty of integrating off-CPU acceleration efficiently.
> In particular it seems weird to consider Intel QAT a reasonable thing to use
> over VAES.
Because some modern CPUs come with on-die crypto offload - and that can do
hash+encrypt or encrypt+hash in parallel. Now, there are a couple of issues
with using the QAT here:
 (1) It doesn't support CTS. This means we'd have to impose the CTS from
     above - and that may well make it unusable in doing hash + encrypt
     simultaneously.

 (2) It really needs batching to make it cheap enough to use. This might
     actually be less of a problem - at least for rxgk. The data is split up
     into fixed-size packets, but for a large amount of data we can end up
     filling packets faster than we can transmit them. This offers the
     opportunity to batch them - up to ~8192 packets in a single batch.
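The batching idea in point (2) can be sketched roughly as follows. This is a
minimal Python illustration, not rxrpc code: `drain_in_batches` and the
`submit` callback are hypothetical names, and the 8192 limit is just the
approximate figure quoted above.

```python
from collections import deque

MAX_BATCH = 8192  # approximate upper bound mentioned above (assumption)


def drain_in_batches(pending, submit, max_batch=MAX_BATCH):
    """Pop queued packets and pass them to submit() one batch at a time.

    `pending` is a deque of packets that were filled faster than they
    could be transmitted; `submit` stands in for a single offload request.
    Returns the number of batches submitted.
    """
    batches = 0
    while pending:
        size = min(max_batch, len(pending))
        batch = [pending.popleft() for _ in range(size)]
        submit(batch)
        batches += 1
    return batches
```

With 20000 queued packets, this would issue three requests rather than 20000,
which is where the per-request overhead of an offload engine gets amortised.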
For NFS, things are a bit different. Because that mostly uses a streaming
transport these days, it wants to prepare a single huge message in one go -
and being able to parallelise the encrypt and the hash could be a benefit.
David