lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 8 Jan 2021 09:21:55 +0100
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Eric Biggers <ebiggers@...nel.org>
Cc:     Russell King - ARM Linux admin <linux@...linux.org.uk>,
        Mark Rutland <mark.rutland@....com>,
        Arnd Bergmann <arnd@...nel.org>,
        "Theodore Ts'o" <tytso@....edu>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        linux-toolchains@...r.kernel.org,
        Ext4 Developers List <linux-ext4@...r.kernel.org>,
        Will Deacon <will@...nel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: Aarch64 EXT4FS inode checksum failures - seems to be weak memory
 ordering issues

On Thu, 7 Jan 2021 at 23:42, Eric Biggers <ebiggers@...nel.org> wrote:
>
> On Thu, Jan 07, 2021 at 10:14:46PM +0000, Russell King - ARM Linux admin wrote:
> > On Thu, Jan 07, 2021 at 10:48:05PM +0100, Arnd Bergmann wrote:
> > > On Thu, Jan 7, 2021 at 5:27 PM Theodore Ts'o <tytso@....edu> wrote:
> > > >
> > > > On Thu, Jan 07, 2021 at 01:37:47PM +0000, Russell King - ARM Linux admin wrote:
> > > > > > The gcc bugzilla mentions backports into gcc-linaro, but I do not see
> > > > > > them in my git history.
> > > > >
> > > > > So, do we raise the minimum gcc version for the kernel as a whole to 5.1
> > > > > or just for aarch64?
> > > >
> > > > Russell, Arnd, thanks so much for tracking down the root cause of the
> > > > bug!
> > >
> > > There is one more thing that I wondered about when looking through
> > > the ext4 code: Should it just call the crc32c_le() function directly
> > > instead of going through the crypto layer? It seems that with Ard's
> > > rework from 2018, that can just call the underlying architecture specific
> > > implementation anyway.
> >
> > Yes, I've been wondering about that too. To me, it looks like the
> > ext4 code performs a layering violation by going "under the covers"
> > - there are accessor functions to set the CRC and retrieve it. ext4
> > instead just makes the assumption that the CRC value is stored after
> > struct shash_desc. Especially as the crypto/crc32c code references
> > the value using:
> >
> >       struct chksum_desc_ctx *ctx = shash_desc_ctx(desc);
> >
> > Not even crypto drivers are allowed to assume that desc+1 is where
> > the CRC is stored.
>
> It violates how the shash API is meant to be used in general, but there is a
> test that enforces that the shash_desc_ctx for crc32c must be just the single
> u32 crc value.  See alg_test_crc32c() in crypto/testmgr.c.  So it's apparently
> intended to work.
>
> >
> > However, struct shash_desc is already 128 bytes in size on aarch64,
>
> Ard Biesheuvel recently sent a patch to reduce the alignment of struct
> shash_desc to ARCH_SLAB_MINALIGN
> (https://lkml.kernel.org/linux-crypto/20210107124128.19791-1-ardb@kernel.org/),
> since apparently most of the bloat is from alignment for DMA, which isn't
> necessary.  I think that reduces the size by a lot on arm64.
>
> > and the proper way of doing it via SHASH_DESC_ON_STACK() is overkill,
> > being strangely 2 * sizeof(struct shash_desc) + 360 (which looks like
> > another bug to me!)
>
> Are you referring to the '2 * sizeof(struct shash_desc)' rather than just
> 'sizeof(struct shash_desc)'?  As mentioned in the comment above
> HASH_MAX_DESCSIZE, there can be a nested shash_desc due to HMAC.
> So I believe the value is correct.
>
> > So, I agree with you wrt crc32c_le(), especially as it would be more
> > efficient, and as the use of crc32c is already hard coded in the ext4
> > code - not only with crypto_alloc_shash("crc32c", 0, 0) but also with
> > the fixed-size structure in ext4_chksum().
> >
> > However, it's ultimately up to the ext4 maintainers to decide.
>
> As I mentioned in my other response, crc32c_le() isn't a proper library API
> (like some of the newer lib/crypto/ stuff) but rather just a wrapper for the
> shash API, and it doesn't handle modules being dynamically loaded/unloaded.
> So switching to it may cause a performance regression.
>
> What I'd recommend is making crc32c_le() able to call architecture-speccific
> implementations directly, similar to blake2s() and chacha20() in lib/crypto/.
> Then there would be no concern about when modules get loaded, etc...
>

I have looked into this in the past, both for crc32(c) and crc-t10dif,
both of which use a horrid method of wrapping a shash into a library
API. This was before we had static calls, though, and this work kind
of stalled on that. It should be straight-forward to redefine the
crc32c() library function as a static call, and have an optimized
module (or builtin) perform the [conditional] static call update at
module_init() time. The only missing piece is autoloading such
modules, which is tricky with softdeps if the dependency is from the
core kernel.

Currently, we have many users of crc32(c) in the kernel that go via
the shash (or synchronous ahash) layer to perform crc32c, all of which
would be better served by a library API, given that the hash type is a
compile time constant, and only synchronous calls are made.




drivers/infiniband/hw/i40iw/i40iw_utils.c: tfm =
crypto_alloc_shash("crc32c", 0, 0);
drivers/infiniband/sw/rxe/rxe_verbs.c: tfm = crypto_alloc_shash("crc32", 0, 0);
drivers/infiniband/sw/siw/siw_main.c: siw_crypto_shash =
crypto_alloc_shash("crc32c", 0, 0);
drivers/md/dm-crypt.c: tcw->crc32_tfm = crypto_alloc_shash("crc32", 0,
drivers/nvme/host/tcp.c: tfm = crypto_alloc_ahash("crc32c", 0,
CRYPTO_ALG_ASYNC);
drivers/nvme/target/tcp.c: tfm = crypto_alloc_ahash("crc32c", 0,
CRYPTO_ALG_ASYNC);
drivers/scsi/iscsi_tcp.c: tfm = crypto_alloc_ahash("crc32c", 0,
CRYPTO_ALG_ASYNC);
drivers/target/iscsi/iscsi_target_login.c: tfm =
crypto_alloc_ahash("crc32c", 0, CRYPTO_ALG_ASYNC);
fs/ext4/super.c: sbi->s_chksum_driver = crypto_alloc_shash("crc32c", 0, 0);
fs/f2fs/super.c: sbi->s_chksum_driver = crypto_alloc_shash("crc32", 0, 0);
fs/jbd2/journal.c: journal->j_chksum_driver =
crypto_alloc_shash("crc32c", 0, 0);
fs/jbd2/journal.c: journal->j_chksum_driver =
crypto_alloc_shash("crc32c", 0, 0);
lib/libcrc32c.c: tfm = crypto_alloc_shash("crc32c", 0, 0);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ