Message-ID: <1ac15a80-6f9a-cf9d-8e79-37d10549a4ca@arm.com>
Date: Thu, 30 Aug 2018 17:32:08 +0100
From: James Morse <james.morse@....com>
To: wufan <wufan@...eaurora.org>
Cc: mchehab@...nel.org, bp@...en8.de, baicar.tyler@...il.com,
linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH] EDAC, ghes: use CPER module handles to locate DIMMs

Hi Fan,

On 30/08/18 15:40, wufan wrote:
>>> @@ -327,12 +349,20 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
>>> p += sprintf(p, "bit_pos:%d ", mem_err->bit_pos);
>>> if (mem_err->validation_bits & CPER_MEM_VALID_MODULE_HANDLE) {
>>> const char *bank = NULL, *device = NULL;
>>> + int index = -1;
>>> +
>>> dmi_memdev_name(mem_err->mem_dev_handle, &bank, &device);
>>
>>> + p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
>>> + mem_err->mem_dev_handle);
>>> if (bank != NULL && device != NULL)
>>> p += sprintf(p, "DIMM location:%s %s ", bank, device);
>>> - else
>>> - p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
>>> - mem_err->mem_dev_handle);
>>
>> Why do we now print the handle every time? The handle is pretty
>> meaningless, it can only be used to find the location-strings, if we get those
>> we print them instead.
>
> For ghes_edac the bank/device is informational, and nothing would go wrong
> if the bank/device numbers are the same as another entry. But the handle
> is now critical for DIMM lookup, thus pull it out.

Is printing the handle to the kernel log critical?
I'd expect something collecting errors to read from sysfs, not dmesg. I thought
the whole point here was to update the per-dimm counters in sysfs.

Thanks,

James