Message-ID: <5804E9D1-15D2-41A9-A483-16985C9810FE@zytor.com>
Date: Mon, 21 Dec 2020 20:54:31 -0800
From: hpa@...or.com
To: tonywwang-oc@...oxin.com, Eric Biggers <ebiggers@...nel.org>
CC: herbert@...dor.apana.org.au, davem@...emloft.net,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, x86@...nel.org,
linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org,
TimGuo-oc@...oxin.com, CooperYan@...oxin.com,
QiyuanWang@...oxin.com, HerryYang@...oxin.com,
CobeChen@...oxin.com, SilviaZhao@...oxin.com
Subject: Re: [PATCH] crypto: x86/crc32c-intel - Don't match some Zhaoxin CPUs
On December 21, 2020 7:01:39 PM PST, tonywwang-oc@...oxin.com wrote:
>On December 22, 2020 3:27:33 AM GMT+08:00, hpa@...or.com wrote:
>>On December 20, 2020 6:46:25 PM PST, tonywwang-oc@...oxin.com wrote:
>>>On December 16, 2020 1:56:45 AM GMT+08:00, Eric Biggers
>>><ebiggers@...nel.org> wrote:
>>>>On Tue, Dec 15, 2020 at 10:15:29AM +0800, Tony W Wang-oc wrote:
>>>>>
>>>>> On 15/12/2020 04:41, Eric Biggers wrote:
>>>>> > On Mon, Dec 14, 2020 at 10:28:19AM +0800, Tony W Wang-oc wrote:
>>>>> >> On 12/12/2020 01:43, Eric Biggers wrote:
>>>>> >>> On Fri, Dec 11, 2020 at 07:29:04PM +0800, Tony W Wang-oc wrote:
>>>>> >>>> The crc32c-intel driver matches CPUs supporting
>>>>> >>>> X86_FEATURE_XMM4_2.
>>>>> >>>> On platforms with Zhaoxin CPUs supporting this x86 feature, when
>>>>> >>>> crc32c-intel and crc32c-generic are both registered, the system
>>>>> >>>> will use crc32c-intel because its .cra_priority is greater than
>>>>> >>>> that of crc32c-generic. Some Zhaoxin CPUs are expected to use the
>>>>> >>>> crc32c-generic driver to get a performance gain, so remove support
>>>>> >>>> for these Zhaoxin CPUs from crc32c-intel.
>>>>> >>>>
>>>>> >>>> Signed-off-by: Tony W Wang-oc <TonyWWang-oc@...oxin.com>
>>>>> >>>
>>>>> >>> Does this mean that the performance of the crc32c instruction
>>>>> >>> on those CPUs is actually slower than a regular C
>>>>> >>> implementation? That's very weird.
>>>>> >>>
>>>>> >>
>>>>> >> From the lmbench3 Create and Delete file test on those chips,
>>>>> >> I think yes.
>>>>> >>
>>>>> >
>>>>> > Did you try measuring the performance of the hashing itself,
>>>>> > and not some higher-level filesystem operations?
>>>>> >
>>>>>
>>>>> Yes. Testing on these Zhaoxin CPUs shows that, with the same
>>>>> input value, the generic C implementation takes less time than the
>>>>> crc32c instruction implementation.
>>>>>
>>>>
>>>>And that is really "working as intended"?
>>>
>>>These CPUs' crc32c instruction is not working as intended.
>>>
>>>>Why do these CPUs even declare that they support the crc32c
>>>>instruction, when it is so slow?
>>>>
>>>
>>>Support for crc32c and some other instructions is enumerated by
>>>CPUID.01:ECX[SSE4.2] = 1. The other instructions are OK; only the
>>>crc32c instruction is not.
>>>
>>>>Are there any other instruction sets (AES-NI, PCLMUL, SSE, SSE2, AVX,
>>>>etc.) that these CPUs similarly declare support for but they are
>>>>uselessly slow?
>>>
>>>No.
>>>
>>>Sincerely
>>>Tonyw
>>>
>>>>
>>>>- Eric
>>
>>Then the right thing to do is to disable the CPUID bit in the
>>vendor-specific startup code.
>
>That would make these CPUs lose support for all of the instruction sets
>enumerated by CPUID.01:ECX[SSE4.2].
>Since only the crc32c instruction is slow, we just want the crc32c-intel
>driver not to match these CPUs.
>
>Sincerely
>Tonyw
Then create a BUG flag for it, or factor out CRC32C into a synthetic flag. We *do not* bury this information in drivers; it becomes a recipe for the same problems over and over.
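
As a rough illustration of the two options, the sketch below models them in plain userspace C. It is not kernel code and not taken from any patch in this thread: every name in it (`CAP_FAST_CRC32C`, `BUG_SLOW_CRC32C`, `cpu_init_caps`, `crc32c_hw_driver_matches`, the `slow_crc32c_quirk` field) is hypothetical. The thread does not say which Zhaoxin models are affected, so the quirk is simply a boolean input here. The point is only that the decision lives in one place in the CPU setup path and the driver tests a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits for this sketch; not the kernel's numbering. */
enum cap_bit {
    CAP_SSE4_2,       /* hardware bit, mirrors CPUID.01:ECX[SSE4.2] */
    CAP_FAST_CRC32C,  /* synthetic flag: crc32c instruction is worth using */
    BUG_SLOW_CRC32C,  /* bug flag: crc32c exists but loses to the C code */
    NCAPS
};

struct cpu_model {
    bool cpuid_sse4_2;       /* what the hardware enumerates */
    bool slow_crc32c_quirk;  /* stand-in for a vendor family/model check */
    uint32_t caps;           /* bitmap filled in by cpu_init_caps() */
};

static void set_cap(struct cpu_model *c, enum cap_bit bit)
{
    assert(bit < NCAPS);
    c->caps |= UINT32_C(1) << bit;
}

static bool has_cap(const struct cpu_model *c, enum cap_bit bit)
{
    assert(bit < NCAPS);
    return (c->caps >> bit) & 1u;
}

/*
 * Models the CPU setup path: the only place that knows about the quirk.
 * SSE4.2 itself stays set, so the other SSE4.2 instructions remain usable.
 */
void cpu_init_caps(struct cpu_model *c)
{
    c->caps = 0;
    if (!c->cpuid_sse4_2)
        return;
    set_cap(c, CAP_SSE4_2);
    if (c->slow_crc32c_quirk)
        set_cap(c, BUG_SLOW_CRC32C);
    else
        set_cap(c, CAP_FAST_CRC32C);
}

/* Models the driver's match step: one flag test, no vendor knowledge. */
bool crc32c_hw_driver_matches(const struct cpu_model *c)
{
    return has_cap(c, CAP_FAST_CRC32C);
}
```

With this shape, a CPU that enumerates SSE4.2 but carries the quirk keeps `CAP_SSE4_2` and is skipped by the accelerated driver, and any other code that cares about slow crc32c can test the same flag instead of repeating a vendor check.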
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.