[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f9634662-23b0-7ec7-aed5-754bd5dc81b8@linux.intel.com>
Date: Tue, 10 Dec 2019 11:32:34 -0600
From: Thor Thayer <thor.thayer@...ux.intel.com>
To: Aaro Koskinen <aaro.koskinen@...ia.com>
Cc: linux-kernel@...r.kernel.org, linux-edac@...r.kernel.org
Subject: Re: [Bisected] altera_edac crash on a system without ECC
Hi Aaro,
On 12/4/19 7:25 AM, Aaro Koskinen wrote:
> Hi,
>
> On Tue, Dec 03, 2019 at 05:09:48PM -0600, Thor Thayer wrote:
>> On 11/29/19 10:57 AM, Aaro Koskinen wrote:
>>> Hi,
>>>
>>> I tried booting v5.4 mainline kernel on a stratix10 board with ECC
>>> disabled, and the altera-edac driver (with only SDRAM enabled) is
>>> now crashing the system instead of failing the probe with "No ECC/ECC
>>> disabled".
>>>
>> I apologize for the late reply. I was on vacation.
>>
>> ECC disabled means the sof/jic that you're loading has ECC disabled,
>> correct?
>
> Yes.
>
>>> This seems to have started with commit 08f08bfb7b4c ("EDAC, altera:
>>> Merge Stratix10 into the Arria10 SDRAM probe routine"). With the change,
>>> looks like sdram probe no longer uses SMC calls and instead accesses
>>> the registers directly. The crash looks like this:
>>
>> I haven't seen this. I'd expect both ECC enabled and disabled to fail with
>> the dumps you have below since they'd both need to use the regmap functions.
>
> With ECC enabled it doesn't fail, as the direct register access appears
> to work then (I also checked by reading 0xf8011101 from userspace -
> and it works without an abort).
>
>> Yes, this does look like it is using the register accesses instead of the
>> SMC call. Line 2206 sets the SMC call after determining from the if()
>> statement if it is a Stratix10 or Arria10 and from below it seems to take
>> the Arria10 path.
>
> But that's setting the ecc_mgr_map. I think that altr_check_ecc_deps
> and altr_sdram_probe use a different mapping. Before commit 08f08bfb7b4c
> there was S10 specific altr_s10_sdram_probe() that took care of the SMC,
> but I cannot see how the current code doing that unless I'm missing some
> special magic.
>
>> The altr_check_ecc_deps() call is checking whether ECC is enabled so the
>> probe should fail.
>>
>> I suspect the device tree. Can you verify the following node is in your
>> device tree?
>>
>> sdramedac {
>> compatible = "altr,sdram-edac-s10";
>> altr,sdr-syscon = <&sdr>;
>> interrupts = <16 4>;
>> };
>
> Yes, I'm using the in-tree socfpga_stratix10.dtsi.
>
> A.
>
This appears to be a setup problem in U-Boot where U-Boot only allows
access to the ECC registers if ECC is enabled. However, the Linux EDAC
driver reads the ECC registers to determine if ECC is enabled or not.
A patch that always allows access to the ECC registers has been posted
to U-Boot.
https://patchwork.ozlabs.org/patch/1205274/
Thanks,
Thor
Powered by blists - more mailing lists