Message-ID: <SN1PR04MB18241CBDC8E4CA5275EE4424EA360@SN1PR04MB1824.namprd04.prod.outlook.com>
Date:   Fri, 24 Aug 2018 15:32:52 +0000
From:   Jeffrey Lien <Jeff.Lien@....com>
To:     Christoph Hellwig <hch@...radead.org>,
        "Martin K. Petersen" <martin.petersen@...cle.com>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
        "herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
        "tim.c.chen@...ux.intel.com" <tim.c.chen@...ux.intel.com>,
        David Darrington <david.darrington@....com>,
        Jeff Furlong <jeff.furlong@....com>
Subject: RE: [PATCH] Performance Improvement in CRC16 Calculations.

I rebuilt my 4.18 kernel with CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y, as Martin recommended, and got even better performance than with the CRC slice-by-16 changes.  Here is a summary of the results:

FIO Sequential Write, 64K Block Size, Queue Depth 64
PCLMUL = y Kernel:     bw = 2237 MiB/s
Slice by 16 CRC Calc:  bw = 1964 MiB/s
Base Kernel:           bw =  357 MiB/s

FIO Sequential Read, 64K Block Size, Queue Depth 64
PCLMUL = y Kernel:     bw = 3839 MiB/s
Slice by 16 CRC Calc:  bw = 2730 MiB/s
Base Kernel:           bw =  797 MiB/s
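
For reference, the relative speedups implied by these numbers can be worked out with a small helper.  This is only a sketch of the arithmetic: the struct, the array, and the speedup() function are names made up for illustration, and the bandwidth values are copied from the FIO results above.

```c
#include <assert.h>
#include <stddef.h>

/* Bandwidths in MiB/s, copied from the FIO results above.
 * The struct and field names are hypothetical, for illustration only. */
struct fio_result {
	const char *workload;
	double pclmul;   /* kernel built with CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y */
	double slice16;  /* kernel with the slice-by-16 CRC patch */
	double base;     /* unmodified base kernel */
};

static const struct fio_result fio_results[] = {
	{ "seq write, 64K, qd 64", 2237.0, 1964.0, 357.0 },
	{ "seq read, 64K, qd 64",  3839.0, 2730.0, 797.0 },
};

/* Ratio of a measured bandwidth to a reference bandwidth. */
static double speedup(double bw, double ref)
{
	assert(ref > 0.0);
	return bw / ref;
}
```

For example, speedup(fio_results[0].pclmul, fio_results[0].base) comes to roughly 6.3x for sequential writes, and speedup(fio_results[1].pclmul, fio_results[1].slice16) to roughly 1.4x for sequential reads.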

So CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y seems to provide the best performance.  Are there any negative side effects to this config option?  If not, does it make sense to recommend that all the major distros change their configs to set CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y by default?


Jeff Lien


-----Original Message-----
From: Christoph Hellwig [mailto:hch@...radead.org] 
Sent: Wednesday, August 22, 2018 1:20 AM
To: Martin K. Petersen <martin.petersen@...cle.com>
Cc: Jeffrey Lien <Jeff.Lien@....com>; linux-kernel@...r.kernel.org; linux-crypto@...r.kernel.org; linux-block@...r.kernel.org; linux-scsi@...r.kernel.org; herbert@...dor.apana.org.au; tim.c.chen@...ux.intel.com; David Darrington <david.darrington@....com>; Jeff Furlong <jeff.furlong@....com>
Subject: Re: [PATCH] Performance Improvement in CRC16 Calculations.

On Tue, Aug 21, 2018 at 09:40:34PM -0400, Martin K. Petersen wrote:
> When crc-t10dif is initialized, the crypto infrastructure will pick 
> the algorithm with the highest priority currently registered. Both 
> block and SCSI will cause crc-t10dif to be compiled as a built-in so 
> this selection happens very early.

Ouch.  This might actually happen in a lot of other users of the crypto functionality as well.

> However, it seems like a bit of a deficiency in crypto that there is 
> no way to upgrade existing transformations if higher priority 
> algorithms become available. btrfs and a few others work around this 
> issue by not using the generic lib/ CRC functions (which defeats the 
> purpose of having these in the first place). Instead they are 
> registering their own transformation at a later time where any 
> accelerator modules are more likely to be loaded.

If we can't fix this in crypto (which doesn't seem that easy), we should at least clearly document the issue somewhere, and fix this in the t10pi code by initializing crct10dif_tfm lazily, only once the first block device starts using it.
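
To make the difference between early and lazy binding concrete, here is a toy userspace model.  It is a sketch only, not kernel code: every toy_* name is hypothetical, the priority values are purely illustrative, and it ignores locking entirely.  It assumes, as described above, that the highest-priority implementation registered at lookup time is chosen and that the choice is never revisited afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of priority-based algorithm selection.
 * None of these names are real kernel APIs. */
struct toy_alg {
	const char *driver;  /* e.g. "generic" or "pclmul" */
	int priority;        /* higher value wins */
};

#define TOY_MAX_ALGS 8
static struct toy_alg toy_algs[TOY_MAX_ALGS];
static size_t toy_nr_algs;

/* Register an implementation, as an accelerator module would at load time. */
static void toy_register(const char *driver, int priority)
{
	assert(toy_nr_algs < TOY_MAX_ALGS);
	toy_algs[toy_nr_algs].driver = driver;
	toy_algs[toy_nr_algs].priority = priority;
	toy_nr_algs++;
}

/* Return the highest-priority implementation registered so far, or NULL. */
static const struct toy_alg *toy_pick_best(void)
{
	const struct toy_alg *best = NULL;
	size_t i;

	for (i = 0; i < toy_nr_algs; i++)
		if (!best || toy_algs[i].priority > best->priority)
			best = &toy_algs[i];
	return best;
}

/* Early binding: chosen once at init time and never revisited. */
static const struct toy_alg *toy_eager_tfm;

static void toy_eager_init(void)
{
	toy_eager_tfm = toy_pick_best();
}

/* Lazy binding: chosen on first use, after later registrations. */
static const struct toy_alg *toy_lazy_tfm;

static const struct toy_alg *toy_lazy_get(void)
{
	if (!toy_lazy_tfm)
		toy_lazy_tfm = toy_pick_best();
	return toy_lazy_tfm;
}
```

If "generic" is registered, toy_eager_init() runs, and only then "pclmul" is registered with a higher priority, toy_eager_tfm stays bound to "generic", while the first call to toy_lazy_get() picks "pclmul".  A real fix would also need to handle concurrent first users, which this model does not attempt.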
