lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 11 May 2023 12:30:41 +0200
From:   Heiko Stübner <heiko@...ech.de>
To:     Nathan Huckleberry <nhuck@...gle.com>
Cc:     palmer@...belt.com, paul.walmsley@...ive.com,
        aou@...s.berkeley.edu, herbert@...dor.apana.org.au,
        davem@...emloft.net, conor.dooley@...rochip.com,
        linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-crypto@...r.kernel.org, christoph.muellner@...ll.eu
Subject: Re: [PATCH v4 4/4] RISC-V: crypto: add accelerated GCM GHASH implementation

Hi Nathan,

Am Dienstag, 11. April 2023, 17:00:00 CEST schrieb Nathan Huckleberry:
> On Wed, Mar 29, 2023 at 7:08 AM Heiko Stuebner <heiko@...ech.de> wrote:
> > +struct riscv64_ghash_ctx {
> > +       void (*ghash_func)(u64 Xi[2], const u128 Htable[16],
> > +                          const u8 *inp, size_t len);
> > +
> > +       /* key used by vector asm */
> > +       u128 htable[16];
> 
> This field looks too big. The assembly only loads the first 128-byte
> value from this table.

OpenSSL defines the Htable field handed to the init- and the other
functions as this "u128 Htable[16]"    [0] . As I really like the concept
of keeping in sync with openSSL, I guess I'd rather not change that.

[0] https://github.com/openssl/openssl/blob/master/crypto/modes/gcm128.c#L88


> Is this copied from another implementation? There's an optimization
> where you precompute the first N powers of H so that you can perform 1
> finite field reduction for every N multiplications, but it doesn't
> look like that's being used here.

The whole crypto-specific code comes from openSSL itself, so for now I
guess I'd like to try keeping things the same.


> > +#define RISCV64_ZBC_SETKEY(VARIANT, GHASH)                             \
> > +void gcm_init_rv64i_ ## VARIANT(u128 Htable[16], const u64 Xi[2]);     \
> > +static int riscv64_zbc_ghash_setkey_ ## VARIANT(struct crypto_shash *tfm,      \
> > +                                          const u8 *key,               \
> > +                                          unsigned int keylen)         \
> > +{                                                                      \
> > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(tfm)); \
> > +       const u64 k[2] = { cpu_to_be64(((const u64 *)key)[0]),          \
> > +                          cpu_to_be64(((const u64 *)key)[1]) };        \
> > +                                                                       \
> > +       if (keylen != GHASH_BLOCK_SIZE)                                 \
> > +               return -EINVAL;                                         \
> > +                                                                       \
> > +       memcpy(&ctx->key, key, GHASH_BLOCK_SIZE);                       \
> > +       gcm_init_rv64i_ ## VARIANT(ctx->htable, k);                     \
> > +                                                                       \
> > +       ctx->ghash_func = gcm_ghash_rv64i_ ## GHASH;                    \
> > +                                                                       \
> > +       return 0;                                                       \
> > +}
> 
> I'd prefer three identical functions over a macro here. Code searching
> tools and compiler warnings are significantly worse with macros.

done :-)


> > +
> > +static int riscv64_zbc_ghash_update(struct shash_desc *desc,
> > +                          const u8 *src, unsigned int srclen)
> > +{
> > +       unsigned int len;
> > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> > +       struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> > +
> > +       if (dctx->bytes) {
> > +               if (dctx->bytes + srclen < GHASH_DIGEST_SIZE) {
> > +                       memcpy(dctx->buffer + dctx->bytes, src,
> > +                               srclen);
> > +                       dctx->bytes += srclen;
> > +                       return 0;
> > +               }
> > +               memcpy(dctx->buffer + dctx->bytes, src,
> > +                       GHASH_DIGEST_SIZE - dctx->bytes);
> > +
> > +               ctx->ghash_func(dctx->shash, ctx->htable,
> > +                               dctx->buffer, GHASH_DIGEST_SIZE);
> > +
> > +               src += GHASH_DIGEST_SIZE - dctx->bytes;
> > +               srclen -= GHASH_DIGEST_SIZE - dctx->bytes;
> > +               dctx->bytes = 0;
> > +       }
> > +       len = srclen & ~(GHASH_DIGEST_SIZE - 1);
> > +
> > +       if (len) {
> > +               gcm_ghash_rv64i_zbc(dctx->shash, ctx->htable,
> > +                               src, len);
> > +               src += len;
> > +               srclen -= len;
> > +       }
> > +
> > +       if (srclen) {
> > +               memcpy(dctx->buffer, src, srclen);
> > +               dctx->bytes = srclen;
> > +       }
> > +       return 0;
> > +}
> > +
> > +static int riscv64_zbc_ghash_final(struct shash_desc *desc, u8 *out)
> > +{
> > +       int i;
> > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> > +       struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> > +
> > +       if (dctx->bytes) {
> > +               for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++)
> > +                       dctx->buffer[i] = 0;
> > +               ctx->ghash_func(dctx->shash, ctx->htable,
> > +                               dctx->buffer, GHASH_DIGEST_SIZE);
> 
> Can we do this without an indirect call?

hmm, the indirect call is in both riscv64_zbc_ghash_update() and
riscv64_zbc_ghash_final() . And I found a missing one at the bottom
of riscv64_zbc_ghash_update(), where gcm_ghash_rv64i_zbc() is
called right now.

Getting rid of the indirect call would mean duplicating both of these
functions for all instances. Especially with the slightly higher
complexity of the update this somehow seems not the best way to go.


Thanks for your pointers
Heiko


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ