lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJkfWY524YtCXebG1Nj1=wGjGJpwRMCyOQyXk4fpg3BvPM17zw@mail.gmail.com>
Date:   Thu, 11 May 2023 12:02:00 -0700
From:   Nathan Huckleberry <nhuck@...gle.com>
To:     Heiko Stübner <heiko@...ech.de>
Cc:     palmer@...belt.com, paul.walmsley@...ive.com,
        aou@...s.berkeley.edu, herbert@...dor.apana.org.au,
        davem@...emloft.net, conor.dooley@...rochip.com,
        linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-crypto@...r.kernel.org, christoph.muellner@...ll.eu
Subject: Re: [PATCH v4 4/4] RISC-V: crypto: add accelerated GCM GHASH implementation

On Thu, May 11, 2023 at 3:30 AM Heiko Stübner <heiko@...ech.de> wrote:
>
> Hi Nathan,
>
> Am Dienstag, 11. April 2023, 17:00:00 CEST schrieb Nathan Huckleberry:
> > On Wed, Mar 29, 2023 at 7:08 AM Heiko Stuebner <heiko@...ech.de> wrote:
> > > +struct riscv64_ghash_ctx {
> > > +       void (*ghash_func)(u64 Xi[2], const u128 Htable[16],
> > > +                          const u8 *inp, size_t len);
> > > +
> > > +       /* key used by vector asm */
> > > +       u128 htable[16];
> >
> > This field looks too big. The assembly only loads the first 128-byte
> > value from this table.
>
> OpenSSL defines the Htable field handed to the init- and the other
> functions as this "u128 Htable[16]"    [0] . As I really like the concept
> of keeping in sync with openSSL, I guess I'd rather not change that.
>
> [0] https://github.com/openssl/openssl/blob/master/crypto/modes/gcm128.c#L88
>
>
> > Is this copied from another implementation? There's an optimization
> > where you precompute the first N powers of H so that you can perform 1
> > finite field reduction for every N multiplications, but it doesn't
> > look like that's being used here.
>
> The whole crypto-specific code comes from openSSL itself, so for now I
> guess I'd like to try keeping things the same.
>
>
> > > +#define RISCV64_ZBC_SETKEY(VARIANT, GHASH)                             \
> > > +void gcm_init_rv64i_ ## VARIANT(u128 Htable[16], const u64 Xi[2]);     \
> > > +static int riscv64_zbc_ghash_setkey_ ## VARIANT(struct crypto_shash *tfm,      \
> > > +                                          const u8 *key,               \
> > > +                                          unsigned int keylen)         \
> > > +{                                                                      \
> > > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(tfm)); \
> > > +       const u64 k[2] = { cpu_to_be64(((const u64 *)key)[0]),          \
> > > +                          cpu_to_be64(((const u64 *)key)[1]) };        \
> > > +                                                                       \
> > > +       if (keylen != GHASH_BLOCK_SIZE)                                 \
> > > +               return -EINVAL;                                         \
> > > +                                                                       \
> > > +       memcpy(&ctx->key, key, GHASH_BLOCK_SIZE);                       \
> > > +       gcm_init_rv64i_ ## VARIANT(ctx->htable, k);                     \
> > > +                                                                       \
> > > +       ctx->ghash_func = gcm_ghash_rv64i_ ## GHASH;                    \
> > > +                                                                       \
> > > +       return 0;                                                       \
> > > +}
> >
> > I'd prefer three identical functions over a macro here. Code searching
> > tools and compiler warnings are significantly worse with macros.
>
> done :-)
>
>
> > > +
> > > +static int riscv64_zbc_ghash_update(struct shash_desc *desc,
> > > +                          const u8 *src, unsigned int srclen)
> > > +{
> > > +       unsigned int len;
> > > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> > > +       struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> > > +
> > > +       if (dctx->bytes) {
> > > +               if (dctx->bytes + srclen < GHASH_DIGEST_SIZE) {
> > > +                       memcpy(dctx->buffer + dctx->bytes, src,
> > > +                               srclen);
> > > +                       dctx->bytes += srclen;
> > > +                       return 0;
> > > +               }
> > > +               memcpy(dctx->buffer + dctx->bytes, src,
> > > +                       GHASH_DIGEST_SIZE - dctx->bytes);
> > > +
> > > +               ctx->ghash_func(dctx->shash, ctx->htable,
> > > +                               dctx->buffer, GHASH_DIGEST_SIZE);
> > > +
> > > +               src += GHASH_DIGEST_SIZE - dctx->bytes;
> > > +               srclen -= GHASH_DIGEST_SIZE - dctx->bytes;
> > > +               dctx->bytes = 0;
> > > +       }
> > > +       len = srclen & ~(GHASH_DIGEST_SIZE - 1);
> > > +
> > > +       if (len) {
> > > +               gcm_ghash_rv64i_zbc(dctx->shash, ctx->htable,
> > > +                               src, len);
> > > +               src += len;
> > > +               srclen -= len;
> > > +       }
> > > +
> > > +       if (srclen) {
> > > +               memcpy(dctx->buffer, src, srclen);
> > > +               dctx->bytes = srclen;
> > > +       }
> > > +       return 0;
> > > +}
> > > +
> > > +static int riscv64_zbc_ghash_final(struct shash_desc *desc, u8 *out)
> > > +{
> > > +       int i;
> > > +       struct riscv64_ghash_ctx *ctx = crypto_tfm_ctx(crypto_shash_tfm(desc->tfm));
> > > +       struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc);
> > > +
> > > +       if (dctx->bytes) {
> > > +               for (i = dctx->bytes; i < GHASH_DIGEST_SIZE; i++)
> > > +                       dctx->buffer[i] = 0;
> > > +               ctx->ghash_func(dctx->shash, ctx->htable,
> > > +                               dctx->buffer, GHASH_DIGEST_SIZE);
> >
> > Can we do this without an indirect call?
>
> hmm, the indirect call is in both riscv64_zbc_ghash_update() and
> riscv64_zbc_ghash_final() . And I found a missing one at the bottom
> of riscv64_zbc_ghash_update(), where gcm_ghash_rv64i_zbc() is
> called right now.
>
> Getting rid of the indirect call would mean duplicating both of these
> functions for all instances. Especially with the slightly higher
> complexity of the update this somehow seems not the best way to go.

Indirect calls are quite expensive. They are an issue for things like
disk/filesystem encryption because it introduces a ton of latency per
block.

I think this is a candidate for static calls. It looks like static
call support hasn't been accepted for riscv yet. Maybe just add a TODO
for now?

See:
https://lwn.net/Articles/771209/
https://lore.kernel.org/all/tencent_A8A256967B654625AEE1DB222514B0613B07@qq.com/

>
>
> Thanks for your pointers
> Heiko
>
>

Thanks,
Huck

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ