Message-ID: <a285c8f8-3276-4628-88b1-a0bc18da5049@amd.com>
Date: Tue, 14 Oct 2025 14:15:40 -0500
From: "Naik, Avadhut" <avadnaik@....com>
To: Borislav Petkov <bp@...en8.de>
Cc: Avadhut Naik <avadhut.naik@....com>, linux-edac@...r.kernel.org,
john.allen@....com, linux-kernel@...r.kernel.org,
Yazen Ghannam <yazen.ghannam@....com>
Subject: [PATCH v3 0/2] Incorporate DRAM address in EDAC messages

On 10/14/2025 12:52, Borislav Petkov wrote:
> On Tue, Oct 14, 2025 at 12:13:36PM -0500, Naik, Avadhut wrote:
>>> The "DRAM address" helps memory vendors analyze failures. System
>>> builders want to collect this data and pass it along to the memory
>>> vendors.
>
> How real is such a use case? It sounds to me like wishful thinking and that no
> one is going to use it in the end and we'll end up warming up the universe
> with electrons needlessly...
>
>>> The DRAM address is not contained in architectural data like
>>> MCA info, and getting the address from MCA requires using additional
>>> system-specific hardware info. It's much more reliable to get the DRAM
>>> address from the system with the error rather than try to post-process
>>> it later.
>
> Ok, a bit better.
>
> Now, why isn't that address part of the tracepoint so that system builders can
> consume structured data instead of parsing scnprintf()-ed strings and trying
> to guess what's there?
>
> Also, some of the fields of TRACE_EVENT(mce_record already contain the fields
> this set is adding - CS or so, for example. So there's redundancy already.
>
Currently, the DRAM address is exported through the RAS mc_event tracepoint,
along with the physical address. Example snippet below:

kworker/4:1-3950 [004] ..... 84373.064068: mc_event: 1 Corrected error: on mc#0csrow#0channel#9 (mc:0 location:0:9:-1 address:0x9ffff000 grain:64 syndrome:0x00000001 Cs: 0x0 Bank Grp: 0x0 Bank Addr: 0x1f Row: 0x27f Column: 0x7e0 RankMul: 0x0 SubChannel: 0x0)

Would you rather have it exported through the mce_record tracepoint?
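
For illustration only: with the string-based output shown in the mc_event
example, a consumer has to recover the DRAM fields by pattern matching. The
sketch below is a rough Python example of that. The function name
parse_dram_fields, the label list, and the regular expression are all
hypothetical and assume the labels appear exactly as in the sample line; this
is not part of any existing tool.

```python
import re

# Driver detail text copied from the mc_event example line.
SAMPLE = (
    "mc:0 location:0:9:-1 address:0x9ffff000 grain:64 syndrome:0x00000001 "
    "Cs: 0x0 Bank Grp: 0x0 Bank Addr: 0x1f Row: 0x27f Column: 0x7e0 "
    "RankMul: 0x0 SubChannel: 0x0"
)

# Hypothetical label list, taken from the sample; some labels contain a space.
DRAM_FIELDS = ("Cs", "Bank Grp", "Bank Addr", "Row", "Column",
               "RankMul", "SubChannel")


def parse_dram_fields(text):
    """Return {label: int} for each DRAM field label found in text."""
    fields = {}
    for label in DRAM_FIELDS:
        # Match "<label>: 0x<hex>" and capture the hex value.
        match = re.search(re.escape(label) + r":\s*(0x[0-9a-fA-F]+)", text)
        if match:
            fields[label] = int(match.group(1), 16)
    return fields


if __name__ == "__main__":
    for name, value in parse_dram_fields(SAMPLE).items():
        print(f"{name}: {value:#x}")
```

Any change to a label or to the field order in the string would silently
break a parser like this, which is the fragility that structured tracepoint
fields avoid.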
>> If yes, will add this information to commit messages and resend.
>
> When that happens, remove all text gunk which talks about what a patch does
> - that should be visible from the diff.
>
> And this is not the first time I'm saying this: folks, please stop explaining
> the code.
>
Will do.
> Thx.
>
--
Thanks,
Avadhut Naik