Message-ID: <a5f623ba-6df1-42f1-a709-aafa59b004ba@amd.com>
Date: Thu, 2 May 2024 12:02:02 -0400
From: Yazen Ghannam <yazen.ghannam@....com>
To: Borislav Petkov <bp@...en8.de>, Robert Richter <rrichter@....com>
Cc: yazen.ghannam@....com, linux-edac@...r.kernel.org,
linux-kernel@...r.kernel.org, tony.luck@...el.com, x86@...nel.org,
Avadhut.Naik@....com, John.Allen@....com
Subject: Re: [PATCH v2 07/16] x86/mce/amd: Simplify DFR handler setup
On 4/30/24 2:06 PM, Borislav Petkov wrote:
> On Mon, Apr 29, 2024 at 08:34:37PM +0200, Robert Richter wrote:
>> After looking a while into it I think the issue was the following:
>>
>> IBS offset was not enabled by firmware, but MCE already was (due to
>> earlier setup). And mce was (maybe) not on all cpus and only one cpu
>> per socket enabled. The IBS vector should be enabled on all cpus. Now
>> firmware allocated offset 1 for mce (instead of offset 0 as for
>> k8). This caused the hardcoded value (offset 1 for IBS) to be already
>> taken. Also, hardcoded values couldn't be used at all as this would
>> not have worked on k8 (for mce). Another issue was to find the
>> next free offset as you couldn't examine just the current cpu. So even
>> if the offset on the current was available, another cpu might have
>> that offset already in use. Yet another problem was that programmed
>> offsets for mce and ibs overlapped each other and the kernel had to
>> reassign them (the ibs offset).
>>
>> I hope I remember all the details correctly here.
>
> I think you're remembering it correctly because after I read this, a very
> very old and dormant brain cell did light up in my head and said, oh
> yeah, that definitely rings a bell!
>
> :-P
>
> Yazen, this is the type of mess I was talking about.
>
Yep, I see what you mean. Definitely a pain :/

So is this the only known issue? And was it encountered in production
systems? Were/are people using IBS on K8 (Family Fh) systems? I know
that perf got support at this time, but do people still use it?

Just as an example, this project has Family 10h as the earliest supported.
https://github.com/jlgreathouse/AMD_IBS_Toolkit

My thinking is that we can simplify the code if there are no practical
issues. And we can address any reported issues as they come.

If you think that's okay, then I can continue with this particular clean
up. If not, then at least we have some more context here.

I'm sure there will be more topics like this when redoing the MCA init path.
:)

Thanks,
Yazen