[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2260c16e-7bb9-7315-5d5b-8656ca1ba93e@oracle.com>
Date: Thu, 28 Jul 2022 20:29:38 +0000
From: Jane Chu <jane.chu@...cle.com>
To: Dan Williams <dan.j.williams@...el.com>,
"tony.luck@...el.com" <tony.luck@...el.com>,
"bp@...en8.de" <bp@...en8.de>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-edac@...r.kernel.org" <linux-edac@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"hch@....de" <hch@....de>,
"nvdimm@...ts.linux.dev" <nvdimm@...ts.linux.dev>
Subject: Re: [PATCH v4] x86/mce: retrieve poison range from hardware
On 7/28/2022 11:46 AM, Dan Williams wrote:
> Jane Chu wrote:
>> On 7/27/2022 1:01 PM, Dan Williams wrote:
>>> Jane Chu wrote:
>>>> On 7/27/2022 12:30 PM, Jane Chu wrote:
>>>>> On 7/27/2022 12:24 PM, Jane Chu wrote:
>>>>>> On 7/27/2022 11:56 AM, Dan Williams wrote:
>>>>>>> Jane Chu wrote:
>>>>>>>> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine
>>>>>>>> poison granularity") that changed nfit_handle_mce() callback to report
>>>>>>>> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
>>>>>>>> discovered that the mce->misc LSB field is 0x1000 bytes, hence
>>>>>>>> injecting
>>>>>>>> 2 back-to-back poisons and the driver ends up logging 8 badblocks,
>>>>>>>> because 0x1000 bytes is 8 512-byte.
>>>>>>>>
>>>>>>>> Dan Williams noticed that apei_mce_report_mem_error() hardcode
>>>>>>>> the LSB field to PAGE_SHIFT instead of consulting the input
>>>>>>>> struct cper_sec_mem_err record. So change to rely on hardware whenever
>>>>>>>> support is available.
>>>>>>>>
>>>>>>>> Link:
>>>>>>>> https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
>>>>>>>>
>>>>>>>>
>>>>>>>> Reviewed-by: Dan Williams <dan.j.williams@...el.com>
>>>>>>>> Signed-off-by: Jane Chu <jane.chu@...cle.com>
>>>>>>>> ---
>>>>>>>> arch/x86/kernel/cpu/mce/apei.c | 14 +++++++++++++-
>>>>>>>> 1 file changed, 13 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/arch/x86/kernel/cpu/mce/apei.c
>>>>>>>> b/arch/x86/kernel/cpu/mce/apei.c
>>>>>>>> index 717192915f28..26d63818b2de 100644
>>>>>>>> --- a/arch/x86/kernel/cpu/mce/apei.c
>>>>>>>> +++ b/arch/x86/kernel/cpu/mce/apei.c
>>>>>>>> @@ -29,15 +29,27 @@
>>>>>>>> void apei_mce_report_mem_error(int severity, struct
>>>>>>>> cper_sec_mem_err *mem_err)
>>>>>>>> {
>>>>>>>> struct mce m;
>>>>>>>> + int grain = PAGE_SHIFT;
>>>>>>>> if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>>>>>>>> return;
>>>>>>>> + /*
>>>>>>>> + * Even if the ->validation_bits are set for address mask,
>>>>>>>> + * to be extra safe, check and reject an error radius '0',
>>>>>>>> + * and fallback to the default page size.
>>>>>>>> + */
>>>>>>>> + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) {
>>>>>>>> + grain = ~mem_err->physical_addr_mask + 1;
>>>>>>>> + if (grain == 1)
>>>>>>>> + grain = PAGE_SHIFT;
>>>>>>>
>>>>>>> Wait, if @grain is the number of bits to mask off the address, shouldn't
>>>>>>> this be something like:
>>>>>>>
>>>>>>> grain = min_not_zero(PAGE_SHIFT,
>>>>>>> hweight64(~mem_err->physical_addr_mask));
>>>>>>
>>>>>> I see. I guess what you meant is
>>>>>> grain = min(PAGE_SHIFT, (1 +
>>>>>> hweight64(~mem_err->physical_addr_mask)));
>>>>>
>>>>> Sorry, take that back, it won't work either.
>>>>
>>>> This will work,
>>>> grain = min_not_zero(PAGE_SHIFT - 1,
>>>> hweight64(~mem_err->physical_addr_mask));
>>>> grain++;
>>>> but too sophisticated? I guess I prefer the simple "if" expression.
>>>
>>> An "if" is fine, I was more pointing out that:
>>>
>>> hweight64(~mem_err->physical_addr_mask) + 1
>>>
>>> ...and:
>>>
>>> ~mem_err->physical_addr_mask + 1;
>>>
>>> ...give different results.
>>
>> They are different indeed. hweight64 returns the count of set bit while
>> ~mem_err->physical_addr_mask returns a negated value.
>>
>> According to the definition of "Physical Address Mask" -
>>
>> https://uefi.org/sites/default/files/resources/UEFI_Spec_2_9_2021_03_18.pdf
>>
>> Table N-31 Memory Error Record
>>
>> Physical Address Mask 24 8 Defines the valid address bits in the
>> Physical Address field. The mask specifies the granularity of the
>> physical address which is dependent on the hw/ implementation factors
>> such as interleaving.
>>
>> It appears that "Physical Address Mask" is defined more like PAGE_MASK
>> rather than in bitops hweight64() ofter used to count the set bits as
>> an indication of (e.g.) how many registers are in use.
>>
>> Ans similar to PAGE_MASK, a valid "Physical Address Mask" should
>> consist of a contiguous low 0 bits, not 1's and 0's mixed up.
>>
>> So far, as far as I can see, the v4 patch still looks correct to me.
>> Please let me know if I'm missing anything.
>
> The v4 patch looks broken to me. If the address mask is
> 0xffffffffffffffc0 to indicate a cacheline error then:
>
> ~mem_err->physical_addr_mask + 1;
>
> ...results in a grain of 64 when it should be 6.
Right, it's the exponent that's needed, so back to __ffs64().
Sorry for the detour. v5 is coming next.
thanks!
-jane
Powered by blists - more mailing lists