lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240620070140.12854-1-qirui.001@bytedance.com>
Date: Thu, 20 Jun 2024 15:01:40 +0800
From: "$(name)" <qirui.001@...edance.com>
To: tony.luck@...el.com
Cc: bp@...en8.de,
	dave.hansen@...ux.intel.com,
	hpa@...or.com,
	linux-edac@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	mingo@...hat.com,
	qirui.001@...edance.com,
	tglx@...utronix.de,
	x86@...nel.org
Subject: RE: [PATCH] x86/mce: count the number of occurrences of each MCE severity

On 6/19/24 上午1:35, Luck, Tony wrote:
> 
>> In the current implementation, we can only know whether each MCE
>> severity has occurred, and cannot know how many times it has occurred
>> accurately. This submission supports viewing how many times each MCE
>> severity has occurred.
> 
> Is know how many times each case was hit useful? The original commit
> for this code said it was just to check coverage for the mce-test suite.
> 
> 4611a6fa4b37 ("x86, mce: export MCE severities coverage via debugfs")
> 
> So you either covered a case in the severities table, or you didn't. Does it
> help to know that you covered a case multiple times?
> 

In the fault injection test in the laboratory, we inject errors multiple
times and need a counter to tell us how many times each case has
occurred and compare it with the expected number to determine the test
results

In the production environment, the counter can reflect the actual number
of times each MCE error type occurs, which can help us detect the MCE
error distribution of large-scale Data center infrastructure 

>> Due to the limitation of char type, the maximum supported statistics are
>> currently 255 times
>>

How about changing char to u64, which is enough for real-world
situations and won't waste a lot of memory? 

>> Signed-off-by: Rui Qi<qirui.001@...edance.com>
>> ---
>>   arch/x86/kernel/cpu/mce/severity.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
>> index dac4d64dfb2a..a81e34c6e3ee 100644
>> --- a/arch/x86/kernel/cpu/mce/severity.c
>> +++ b/arch/x86/kernel/cpu/mce/severity.c
>> @@ -405,7 +405,7 @@ static noinstr int mce_severity_intel(struct mce *m, struct pt_regs *regs, char
>>                        continue;
>>                if (msg)
>>                        *msg = s->msg;
>> -             s->covered = 1;
>> +             s->covered++;
> 
> Wraparound sets this back to zero. Should this be:
> 
> 	if (s->covered < 255)
> 		s->covered++;
> 
> [Is there a #define for max value in an unsigned char? I could find one. If there is,
> then use that instead of hard coded 255]
> 
>>
>>                return s->sev;
>>        }--
> 
> -Tony

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ