Date: Tue, 11 Jun 2024 13:32:09 -0700
From: Eric Biggers <ebiggers@...nel.org>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: Ard Biesheuvel <ardb@...nel.org>,
	Steffen Klassert <steffen.klassert@...unet.com>,
	netdev@...r.kernel.org, linux-crypto@...r.kernel.org,
	fsverity@...ts.linux.dev, dm-devel@...ts.linux.dev, x86@...nel.org,
	linux-arm-kernel@...ts.infradead.org,
	Sami Tolvanen <samitolvanen@...gle.com>,
	Bart Van Assche <bvanassche@....org>,
	Tim Chen <tim.c.chen@...ux.intel.com>
Subject: Re: [PATCH v4 6/8] fsverity: improve performance by using
 multibuffer hashing

On Tue, Jun 11, 2024 at 11:39:08PM +0800, Herbert Xu wrote:
> On Tue, Jun 11, 2024 at 11:21:43PM +0800, Herbert Xu wrote:
> >
> > Therefore if we switched to a linked-list API networking could
> > give us the buffers with minimal changes.
> 
> BTW, this is not just about parallelising hashing.  One of the most
> significant benefits of GSO comes not from hardware offload but
> from amortising (network) stack overhead.  IOW you're traversing a
> very deep stack once instead of 40 times (this is the factor for
> 64K vs MTU; if we extend beyond 64K, which we absolutely should do,
> the benefit would increase as well).
> 
> The same should apply to the Crypto API.  So even if this was a
> purely software solution with no assembly code at all, it may well
> improve GCM performance (at least for users able to feed us bulk
> data, like networking).
> 

At best this would save an indirect call per message, and only if the
underlying algorithm explicitly added support for it and the user of the API
migrated to the multi-request model.  This alone doesn't seem worth the effort
of migrating to multi-request, especially considering the many other
already-possible optimizations that would not require API changes or migrating
users to multi-request.  The x86_64 AES-GCM code is pretty well optimized now
after my recent patches, but there's still an indirect call associated with the
use of the SIMD helper which could be eliminated.  That would save one indirect
call per message, which is already as much as we could hope to get from
multi-request.  authenc, on the other hand, is almost totally unoptimized, as I
mentioned before; it makes little sense to talk about any sort of multi-request
optimization for it at this point.

- Eric
