linux-kernel - Re: [PATCH v8 0/7] Optimize dm-verity and fsverity using multibuffer hashing

Message-ID: <20250214033518.GA2771@sol.localdomain>
Date: Thu, 13 Feb 2025 19:35:18 -0800
From: Eric Biggers <ebiggers@...nel.org>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: fsverity@...ts.linux.dev, linux-crypto@...r.kernel.org,
	dm-devel@...ts.linux.dev, x86@...nel.org,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	Ard Biesheuvel <ardb@...nel.org>,
	Sami Tolvanen <samitolvanen@...gle.com>,
	Alasdair Kergon <agk@...hat.com>, Mike Snitzer <snitzer@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Mikulas Patocka <mpatocka@...hat.com>,
	David Howells <dhowells@...hat.com>, netdev@...r.kernel.org
Subject: Re: [PATCH v8 0/7] Optimize dm-verity and fsverity using multibuffer
 hashing

On Fri, Feb 14, 2025 at 10:44:47AM +0800, Herbert Xu wrote:
> On Wed, Feb 12, 2025 at 10:33:04PM -0800, Eric Biggers wrote:
> > 
> > I've already covered this extensively, but here we go again.  First there are
> > more users of shash than ahash in the kernel, since shash is much easier to use
> > and also a bit faster.  There is nothing storage specific about it.  You've
> > claimed that shash is deprecated, but that reflects a misunderstanding of what
> > users actually want and need.  Users want simple, fast, easy-to-use APIs.  Not
> > APIs that are optimized for an obsolete form of hardware offload and have
> > CPU-based crypto support bolted on as an afterthought.
> 
> The ahash interface was not designed for hardware offload, it's
> exactly the same as the skcipher interface which caters for all
> users.  The shash interface was a mistake, one which I've only
> come to realise after adding the corresponding lskcipher interface.

It absolutely is designed for an obsolete form of hardware offload.  Have you
ever tried actually using it?  Here's how to hash a buffer of data with shash:

	return crypto_shash_tfm_digest(tfm, data, size, out);

... and here's how to do it with the SHA-256 library, for what it's worth:

	sha256(data, size, out);

and here's how to do it with ahash:

	struct ahash_request *req;
	struct scatterlist sg;
	DECLARE_CRYPTO_WAIT(wait);
	int err;

	req = ahash_request_alloc(tfm, GFP_KERNEL);
	if (!req)
		return -ENOMEM;

	sg_init_one(&sg, data, size);
	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP |
					CRYPTO_TFM_REQ_MAY_BACKLOG,
				   crypto_req_done, &wait);
	ahash_request_set_crypt(req, &sg, out, size);

	err = crypto_wait_req(crypto_ahash_digest(req), &wait);

	ahash_request_free(req);
	return err;

Hmm, I wonder which API users would rather use?

The extra complexity is from supporting an obsolete form of hardware offload.

Yes, skcipher and aead have the same problem, but that doesn't mean it is right.

> > Second, these days TLS and IPsec usually use AES-GCM, which is inherently
> > parallelizable so does not benefit from multibuffer crypto.  This is a major
> > difference between the AEADs and message digest algorithms in common use.  And
> > it happens that I recently did a lot of work to optimize AES-GCM on x86_64; see
> > my commits in v6.11 that made AES-GCM 2-3x as fast on VAES-capable CPUs.
> 
> Bravo to your efforts on improving GCM.  But that does not mean that
> GCM is not amenable to parallel processing.  While CTR itself is
> obviously already parallel, the GHASH algorithm can indeed benefit
> from parallel processing like any other hashing algorithm.

What?  GHASH is a polynomial hash function, so it is easily parallelizable.  If
you precompute N powers of the hash key then you can process N blocks in
parallel.  Check how the AES-GCM assembly code works; that's exactly what it
does.  This is fundamentally different from message digests like SHA-* where the
blocks have to be processed serially.
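
To make the precomputed-powers point concrete, here is a minimal userspace sketch. It is an illustration only, not the kernel's AES-GCM code: it uses a toy polynomial hash over integers modulo the prime 2^31 - 1 instead of GHASH's GF(2^128) arithmetic, and the names poly_serial() and poly_4way() are hypothetical. The algebra is the same, though: with h^1..h^4 precomputed, the four multiplications in each step are independent of each other, whereas the serial (Horner) form has to finish one multiplication before starting the next.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy modulus (2^31 - 1). An assumption for illustration: real GHASH
 * multiplies in GF(2^128), not modulo a prime. */
#define TOY_P 2147483647ULL

static uint64_t mulmod(uint64_t a, uint64_t b)
{
	/* Both operands are reduced below 2^31, so the product fits in 64 bits. */
	return ((a % TOY_P) * (b % TOY_P)) % TOY_P;
}

/* Serial Horner form: each step depends on the previous accumulator. */
static uint64_t poly_serial(const uint64_t *m, size_t n, uint64_t h)
{
	uint64_t acc = 0;
	size_t i;

	for (i = 0; i < n; i++)
		acc = mulmod((acc + m[i] % TOY_P) % TOY_P, h);
	return acc;
}

/* 4-way form: precompute h^1..h^4, then handle four blocks per step. */
static uint64_t poly_4way(const uint64_t *m, size_t n, uint64_t h)
{
	uint64_t hp[4];	/* hp[k] holds h^(k+1) */
	uint64_t acc = 0;
	size_t i, k;

	hp[0] = h % TOY_P;
	for (k = 1; k < 4; k++)
		hp[k] = mulmod(hp[k - 1], h);

	for (i = 0; i + 4 <= n; i += 4) {
		/* (acc + m0)*h^4 + m1*h^3 + m2*h^2 + m3*h^1:
		 * the four products do not depend on one another. */
		uint64_t t0 = mulmod((acc + m[i] % TOY_P) % TOY_P, hp[3]);
		uint64_t t1 = mulmod(m[i + 1], hp[2]);
		uint64_t t2 = mulmod(m[i + 2], hp[1]);
		uint64_t t3 = mulmod(m[i + 3], hp[0]);

		acc = (t0 + t1 + t2 + t3) % TOY_P;
	}

	/* Leftover blocks (fewer than four) fall back to the serial form. */
	for (; i < n; i++)
		acc = mulmod((acc + m[i] % TOY_P) % TOY_P, h);
	return acc;
}
```

Both functions return the same value for any input, which is the whole point: the parallel form is just a regrouping of the same polynomial. No comparable regrouping exists for SHA-256, where each compression function call takes the previous call's output as its chaining value.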

- Eric