linux-kernel - Re: [PATCH v4] crc32c: Implement CRC32c with slicing-by-8 algorithm

Message-ID: <OF54094699.6F5CF2B6-ONC125791C.004C901C-C125791C.004D1A61@transmode.se>
Date:	Sat, 1 Oct 2011 16:02:10 +0200
From:	Joakim Tjernlund <joakim.tjernlund@...nsmode.se>
To:	"Darrick J. Wong" <djwong@...ibm.com>
Cc:	Andreas Dilger <adilger.kernel@...ger.ca>,
	Mingming Cao <cmm@...ibm.com>,
	David Miller <davem@...emloft.net>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	linux-crypto <linux-crypto@...r.kernel.org>,
	linux-ext4@...r.kernel.org,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Bob Pearson <rpearson@...temfabricworks.com>,
	Theodore Tso <tytso@....edu>
Subject: Re: [PATCH v4] crc32c: Implement CRC32c with slicing-by-8 algorithm


"Darrick J. Wong" <djwong@...ibm.com> wrote on 2011/09/30 21:29:56:
>
> The existing CRC32c implementation uses Sarwate's algorithm to calculate the
> code one byte at a time.  Using a slicing-by-8 algorithm adapted from Bob
> Pearson, we can process buffers 8 bytes at a time, for a substantial increase
> in performance.
>
> The motivation for this patchset is that I am working on adding full metadata
> checksumming to ext4 and jbd2.  As far as performance impact of adding
> checksumming goes, I see nearly no change with a standard mail server ffsb
> simulation.  On a test that involves only metadata operations (file creation
> and deletion, and fallocate/truncate), I see a drop of about 50 percent with
> the current kernel crc32c implementation; this improves to a drop of about 20
> percent with the enclosed crc32c code.
>
> In the common case, this new implementation doesn't help much because
> metadata is usually a small fraction of total IO.
> However, when we are doing IO that is almost all metadata (such as rm -rf'ing a
> tree), then this patch speeds up the operation substantially.
>
> Given that iscsi, sctp, and btrfs also use crc32c, this patchset should improve
> their speed as well.  I have some preliminary results[1] that show the
> difference in various crc algorithms that I've come across: the "crc32c-by8-le"
> column is the new algorithm in the patch; the "crc32c" column is the current
> crc32c kernel implementation; and the "crc32-kern-le" column is the current
> crc32 kernel implementation, which is similar to the results one gets for
> CONFIG_CRC32C_SLICEBY4=y.  As you can see, the new implementation runs at
> nearly 4x the speed of the current implementation; even the slimmer slice-by-4
> implementation is generally 2-3x faster.
>
> However, the implementation allows the kernel builder to select from a variety
> of space-speed tradeoffs, should my results not hold true on a particular
> class of system.
>
> v2: Use the crypto testmgr api for self-test.
> v3: Get rid of the -be version, which had no users.
> v4: Allow kernel builder a choice of speed vs. space optimization.
>
> [1]http://djwong.org/docs/ext4_metadata_checksums.html
> (cached copy of the ext4 wiki)
>
> Signed-off-by: Darrick J. Wong <djwong@...ibm.com>

This is based on an old version of Bob's slice-by-8 that has lots of duplication
and is hard to maintain.

Start from Bob's latest patches and add crc32c to lib/crc32.c.

Also, for crc32c I think you only need slice by 4 and slice by 8.

 Jocke
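
For readers following the thread, the difference between the byte-at-a-time (Sarwate) loop and slicing-by-8 can be sketched in user-space C as below. This is only an illustration under stated assumptions, not the code from the patch or from lib/crc32.c: the names crc32c_init, crc32c_bytewise, crc32c_slice8 and tab are hypothetical, and the tables are built at run time for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Castagnoli (CRC32c) polynomial in bit-reflected form. */
#define CRC32C_POLY_LE 0x82F63B78u

/* Hypothetical table name: tab[0] is the classic Sarwate table,
 * tab[1..7] extend it so that 8 input bytes can be folded at once. */
static uint32_t tab[8][256];

static void crc32c_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ ((c & 1) ? CRC32C_POLY_LE : 0);
        tab[0][i] = c;
    }
    /* tab[t][i] is tab[t-1][i] advanced by one more zero byte. */
    for (int t = 1; t < 8; t++)
        for (uint32_t i = 0; i < 256; i++)
            tab[t][i] = (tab[t - 1][i] >> 8) ^ tab[0][tab[t - 1][i] & 0xff];
}

/* Sarwate: one table lookup per input byte. */
static uint32_t crc32c_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len--)
        crc = (crc >> 8) ^ tab[0][(crc ^ *p++) & 0xff];
    return crc;
}

/* Slicing-by-8: eight independent lookups per 8 input bytes,
 * then the byte-wise loop for the remaining 0..7 bytes. */
static uint32_t crc32c_slice8(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len >= 8) {
        /* Assemble bytes explicitly so the result does not depend
         * on host endianness or pointer alignment. */
        uint32_t lo = crc ^ ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                             (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
        uint32_t hi = (uint32_t)p[4] | (uint32_t)p[5] << 8 |
                      (uint32_t)p[6] << 16 | (uint32_t)p[7] << 24;

        crc = tab[7][lo & 0xff] ^ tab[6][(lo >> 8) & 0xff] ^
              tab[5][(lo >> 16) & 0xff] ^ tab[4][lo >> 24] ^
              tab[3][hi & 0xff] ^ tab[2][(hi >> 8) & 0xff] ^
              tab[1][(hi >> 16) & 0xff] ^ tab[0][hi >> 24];
        p += 8;
        len -= 8;
    }
    return crc32c_bytewise(crc, p, len);
}
```

Both functions take and return the raw running CRC state; with the usual convention of seeding with 0xFFFFFFFF and inverting the final value, the 9-byte ASCII string "123456789" yields the standard CRC32c check value 0xE3069283 from either routine. A slice-by-4 variant would use the same scheme with only tab[0..3], which is the space-speed tradeoff discussed above.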
