[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <00500974-474d-3559-c141-3cc758bc0423@canonical.com>
Date: Thu, 18 Aug 2022 10:17:22 +0200
From: Heinrich Schuchardt <heinrich.schuchardt@...onical.com>
To: Daire.McNamara@...rochip.com
Cc: linux-riscv@...ts.infradead.org,
emil.renner.berthing@...onical.com, devicetree@...r.kernel.org,
paul.walmsley@...ive.com, aou@...s.berkeley.edu,
palmer@...belt.com, geert@...ux-m68k.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org,
atishp@...osinc.com, krzysztof.kozlowski+dt@...aro.org,
robh+dt@...nel.org, Conor.Dooley@...rochip.com
Subject: Re: [PATCH 1/1] riscv: dts: microchip: correct L2 cache interrupts
On 8/18/22 09:03, Daire.McNamara@...rochip.com wrote:
> On Wed, 2022-08-17 at 18:04 +0000, Conor Dooley - M52691 wrote:
>> Hey Heinrich,
>> Interesting CC list you got there! Surprised the mailmap didn't sort
>> out Atish & Krzysztof's addresses, but I think I've fixed them up.
>> I see Daire isn't there either so +CC him too.
>>
>> On 17/08/2022 14:25, Heinrich Schuchardt wrote:
>>> EXTERNAL EMAIL: Do not click links or open attachments unless you
>>> know the content is safe
>>>
>>> The "PolarFire SoC MSS Technical Reference Manual" documents the
>>> following PLIC interrupts:
>>>
>>> 1 - L2 Cache Controller Signals when a metadata correction event
>>> occurs
>>> 2 - L2 Cache Controller Signals when an uncorrectable metadata
>>> event occurs
>>> 3 - L2 Cache Controller Signals when a data correction event occurs
>>> 4 - L2 Cache Controller Signals when an uncorrectable data event
>>> occurs
>>>
>>> This differs from the SiFive FU540 which only has three L2 cache
>>> related
>>> interrupts.
>>>
>>> The sequence in the device tree is defined by an enum:
> in drivers/soc/sifive/sifive_l2_cache.c
>>>
>>> enum {
>>> DIR_CORR = 0,
>>> DATA_CORR,
>>> DATA_UNCORR,
>>> DIR_UNCORR,
>>> };
>>
>> Nit: more accurately by the dt-binding:
>> interrupts:
>> minItems: 3
>> items:
>> - description: DirError interrupt
>> - description: DataError interrupt
>> - description: DataFail interrupt
>> - description: DirFail interrupt
>>
>> I do find the names in the enum to be a bit more understandable
>> however,
>> and ditto for the descriptions in our TRM... Maybe I should put that
>> on
>> my todo list of cleanups :)
>>
>>
>>> So the correct sequence of the L2 cache interrupts is
>>>
>>> interrupts = <1>, <3>, <4>, <2>;
>>
>> This looks correct to me. You mentioned on IRC that what you were
>> seeing
>> was a wall of
>> L2CACHE: DataFail @ 0x00000000.0807FFD8
>> From a quick look at the driver, what seems to be happening here is
>> that
>> at some point (possibly before Linux even comes into the picture)
>> there
>> is an uncorrectable data error. Because the ordering in the dt is
>> wrong,
>> we read the wrong register and so the interrupt is never actually
>> cleared. With this patch applied, I see a single DataFail right as
>> the
>> interrupt gets registed & nothing after that.
>>
>> I am not really sure what value there is in enabling that driver
>> though,
>> mostly just seems like a debugging tool & from our pov we would see
>> the
>> HSS running in the monitor core as being responsible for handling the
>> l2-cache errors.
>>
>> @Daire, maybe you have an opinion here?
> Likewise. The new ordering of the interrupts to what the driver expects
> looks correct - as far as it goes. However, I'm not convinced enabling
> the SiFive l2 cache driver out of the box makes sense. Using l2 cache
> driver doesn't align terribly well with the current MPFS roadmap for
> mgt of ECC errors.
>>
>> Patch LGTM, so I'll likely apply it in the next day or two, would
>> just
>> like to see what Daire has to say first.
> If l2-cache controller is enabled, then interrupts should be connected
> as per TRM. I think this specific patch lgtm, ideally with a
> 'disabled' stanza and it's up to individual MPFS customers/boards to
> enable l2 cache controller if they want it.
Disabling the device in the device-tree is orthogonal to fixing the
interrupt sequence. I would suggest that you use a separate patch for
adding status = "disabled";.
Best regards
Heinrich
>>
>>> Fixes: e35b07a7df9b ("riscv: dts: microchip: mpfs: Group tuples in
>>> interrupt properties")
>>
>> BTW, it isn't really fixing this patch right? This is a dependency
>> for
>> backports to 5.15.
>>
>> Thanks for your patch,
>> Conor.
>>
>>> Fixes: 0fa6107eca41 ("RISC-V: Initial DTS for Microchip ICICLE
>>> board")
>>> Cc: Conor Dooley <conor.dooley@...rochip.com>
>>> Cc: stable@...r.kernel.org
>>> Signed-off-by: Heinrich Schuchardt <
>>> heinrich.schuchardt@...onical.com>
>>> ---
>>> arch/riscv/boot/dts/microchip/mpfs.dtsi | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/riscv/boot/dts/microchip/mpfs.dtsi
>>> b/arch/riscv/boot/dts/microchip/mpfs.dtsi
>>> index 496d3b7642bd..ec1de6344be9 100644
>>> --- a/arch/riscv/boot/dts/microchip/mpfs.dtsi
>>> +++ b/arch/riscv/boot/dts/microchip/mpfs.dtsi
>>> @@ -169,7 +169,7 @@ cctrllr: cache-controller@...0000 {
>>> cache-size = <2097152>;
>>> cache-unified;
>>> interrupt-parent = <&plic>;
>>> - interrupts = <1>, <2>, <3>;
>>> + interrupts = <1>, <3>, <4>, <2>;
>>> };
>>>
>>> clint: clint@...0000 {
>>> --
>>> 2.36.1
>>>
Powered by blists - more mailing lists