Message-ID: <4CDBFCA0-56B9-495B-9660-3BE9018BC8AE@zhaoxin.com>
Date: Tue, 22 Dec 2020 11:01:39 +0800
From: <tonywwang-oc@...oxin.com>
To: <hpa@...or.com>, Eric Biggers <ebiggers@...nel.org>
CC: <herbert@...dor.apana.org.au>, <davem@...emloft.net>,
<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
<x86@...nel.org>, <linux-crypto@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <TimGuo-oc@...oxin.com>,
<CooperYan@...oxin.com>, <QiyuanWang@...oxin.com>,
<HerryYang@...oxin.com>, <CobeChen@...oxin.com>,
<SilviaZhao@...oxin.com>
Subject: Re: [PATCH] crypto: x86/crc32c-intel - Don't match some Zhaoxin CPUs
On December 22, 2020 3:27:33 AM GMT+08:00, hpa@...or.com wrote:
>On December 20, 2020 6:46:25 PM PST, tonywwang-oc@...oxin.com wrote:
>>On December 16, 2020 1:56:45 AM GMT+08:00, Eric Biggers
>><ebiggers@...nel.org> wrote:
>>>On Tue, Dec 15, 2020 at 10:15:29AM +0800, Tony W Wang-oc wrote:
>>>>
>>>> On 15/12/2020 04:41, Eric Biggers wrote:
>>>> > On Mon, Dec 14, 2020 at 10:28:19AM +0800, Tony W Wang-oc wrote:
>>>> >> On 12/12/2020 01:43, Eric Biggers wrote:
>>>> >>> On Fri, Dec 11, 2020 at 07:29:04PM +0800, Tony W Wang-oc wrote:
>>>> >>>> The driver crc32c-intel match CPUs supporting X86_FEATURE_XMM4_2.
>>>> >>>> On platforms with Zhaoxin CPUs supporting this X86 feature, When
>>>> >>>> crc32c-intel and crc32c-generic are both registered, system will
>>>> >>>> use crc32c-intel because its .cra_priority is greater than
>>>> >>>> crc32c-generic. This case expect to use crc32c-generic driver for
>>>> >>>> some Zhaoxin CPUs to get performance gain, So remove these Zhaoxin
>>>> >>>> CPUs support from crc32c-intel.
>>>> >>>>
>>>> >>>> Signed-off-by: Tony W Wang-oc <TonyWWang-oc@...oxin.com>
>>>> >>>
>>>> >>> Does this mean that the performance of the crc32c instruction on
>>>> >>> those CPUs is actually slower than a regular C implementation?
>>>> >>> That's very weird.
>>>> >>>
>>>> >>
>>>> >> From the lmbench3 Create and Delete file test on those chips, I
>>>> >> think yes.
>>>> >>
>>>> >
>>>> > Did you try measuring the performance of the hashing itself, and
>>>> > not some higher-level filesystem operations?
>>>> >
>>>>
>>>> Yes. Was testing on these Zhaoxin CPUs, the result is that with the
>>>> same input value the generic C implementation takes fewer time than
>>>> the crc32c instruction implementation.
>>>>
>>>
>>>And that is really "working as intended"?
>>
>>These CPU's crc32c instruction is not working as intended.
>>
>>>Why do these CPUs even declare that
>>>they support the crc32c instruction, when it is so slow?
>>>
>>
>>The presence of crc32c and some other instructions supports are
>>enumerated by CPUID.01:ECX[SSE4.2] = 1, other instructions are ok
>>except the crc32c instruction.
>>
>>>Are there any other instruction sets (AES-NI, PCLMUL, SSE, SSE2, AVX,
>>>etc.) that these CPUs similarly declare support for but they are
>>>uselessly slow?
>>
>>No.
>>
>>Sincerely
>>Tonyw
>>
>>>
>>>- Eric
>
>Then the right thing to do is to disable the CPUID bit in the
>vendor-specific startup code.

Doing it that way would make these CPUs lose support for every
instruction enumerated by CPUID.01:ECX[SSE4.2], not only crc32c.
Since the crc32c instruction is the only slow one, we would just like
the crc32c-intel driver not to match these CPUs.
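
To make the difference between the two approaches concrete, here is a
small user-space C sketch. It is not kernel code and not the patch
itself: struct cpu_desc, the slow_crc32c flag and all helper names are
made up for illustration. The only architectural fact it relies on is
that CPUID.01:ECX bit 20 enumerates SSE4.2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01:ECX bit 20 enumerates SSE4.2, which includes the crc32 instruction. */
#define CPUID_1_ECX_SSE42 (UINT32_C(1) << 20)

/* Hypothetical CPU description for this sketch; not a kernel structure. */
struct cpu_desc {
    uint32_t cpuid_1_ecx; /* raw CPUID.01:ECX value */
    bool slow_crc32c;     /* assumed quirk flag for the affected CPUs */
};

static bool has_sse42(const struct cpu_desc *c)
{
    return (c->cpuid_1_ecx & CPUID_1_ECX_SSE42) != 0;
}

/* Approach A: clear the CPUID bit at startup on the affected CPUs. */
static void hide_sse42_if_slow(struct cpu_desc *c)
{
    if (c->slow_crc32c)
        c->cpuid_1_ecx &= ~CPUID_1_ECX_SSE42;
}

/* Approach B: leave the bit alone; only the crc32c driver declines to match. */
static bool crc32c_hw_matches(const struct cpu_desc *c)
{
    return has_sse42(c) && !c->slow_crc32c;
}

/* Walk both approaches on the same made-up affected CPU. */
static void compare_approaches(void)
{
    struct cpu_desc a = { CPUID_1_ECX_SSE42, true };
    struct cpu_desc b = a;

    hide_sse42_if_slow(&a);
    assert(!has_sse42(&a));         /* A: every SSE4.2 user loses the feature */
    assert(!crc32c_hw_matches(&a));

    assert(has_sse42(&b));          /* B: other SSE4.2 users are unaffected */
    assert(!crc32c_hw_matches(&b)); /* only crc32c falls back to the generic code */
}
```

With approach A the whole SSE4.2 feature disappears on the affected
CPU; with approach B only the hardware crc32c path is skipped, so the
generic driver is chosen for crc32c while the other SSE4.2 instructions
stay usable.
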
Sincerely
Tonyw