[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<TYZPR06MB520328BE1AEE561EB54F0A53B25C2@TYZPR06MB5203.apcprd06.prod.outlook.com>
Date: Thu, 7 Nov 2024 05:34:50 +0000
From: Chin-Ting Kuo <chin-ting_kuo@...eedtech.com>
To: Patrick Williams <patrick@...cx.xyz>
CC: "joel@....id.au" <joel@....id.au>, "andrew@...econstruct.com.au"
<andrew@...econstruct.com.au>, "wim@...ux-watchdog.org"
<wim@...ux-watchdog.org>, "linux@...ck-us.net" <linux@...ck-us.net>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "linux-aspeed@...ts.ozlabs.org"
<linux-aspeed@...ts.ozlabs.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-watchdog@...r.kernel.org"
<linux-watchdog@...r.kernel.org>, "Peter.Yin@...ntatw.com"
<Peter.Yin@...ntatw.com>, "Patrick_NC_Lin@...ynn.com"
<Patrick_NC_Lin@...ynn.com>, "Bonnie_Lo@...ynn.com" <Bonnie_Lo@...ynn.com>,
"DELPHINE_CHIU@...ynn.com" <DELPHINE_CHIU@...ynn.com>, BMC-SW
<BMC-SW@...eedtech.com>, "chnguyen@...erecomputing.com"
<chnguyen@...erecomputing.com>
Subject: RE: [PATCH v4 1/3] watchdog: aspeed: Update bootstatus handling
Hi Patrick,
Thanks for the check and reply.
> -----Original Message-----
> From: Patrick Williams <patrick@...cx.xyz>
> Sent: Saturday, November 2, 2024 2:21 AM
> Subject: Re: [PATCH v4 1/3] watchdog: aspeed: Update bootstatus handling
>
> On Fri, Nov 01, 2024 at 08:11:59PM +0800, Chin-Ting Kuo wrote:
> > The boot status mapping rule follows the latest design guide from the
> > OpenBMC shown as below.
> >
> https://github.com/openbmc/docs/blob/master/designs/bmc-reboot-cause-up
> date.md#proposed-design
> > - WDIOF_EXTERN1 => system is reset by Software
> > - WDIOF_CARDRESET => system is reset by WDT SoC reset
> > - Others => other reset events, e.g., power on reset.
>
> I'm quite surprised that the above is relevant for a kernel driver at all. Isn't
> "EXTERN1" a name of a real watchdog signal from your hardware (my
> recollection is that there are 2 external watchdogs). I think the point of this
> referenced design document was that most users of BMCs have "EXTERN1"
> used a for software reset conditions.
> `CARDRESET` should be representing resets by the watchdog itself.
>
ASPEED WDT controller is able to generate an external signal, wdt_ext, when the
WDT timeout occurs. However, after system is rebooted, WDT controller doesn't
know whether wdt_ext signal is generated previously. Moreover, whether BMC is
reset due to this wdt_ext signal depends on the HW board design. On some boards,
wdt_ext can affect EXTRST#, BMC external reset signal, indirectly. For this scenario,
EXTERN1 can be classified to wdt_ext. On the other boards, wdt_ext just represents
WDT timeout event for specific tasks. Thus, EXTERN1 boot status can be ignored in
ASPEED WDT driver and just implement "CARDRESET" and "others" since EXTERN1
is not always affected/controlled by WDT controller directly. Or, an additional
property in dts can be added to distinguish whether the EXTRST# reset event is
triggered by wdt_ext signal.
> The purpose of this design proposal was not to require very specific changes to
> individual watchdog drivers, but to align the userspace use with the best
> practices already from other watchdog drivers. I don't think the kernel driver
> should be bending to match a particular userspace implementation; you should
> be exposing the information available to your hardware.
>
Agree.
> Having said that, it was known that there would need to be changes to the
> driver because some of these conditions were not adequately exposed at all.
> I'm just still surprised that we're needing to reference that document as part of
> these changes.
>
Yes, if only "CARDRESET" and "others" types are taken into consideration, some
patches are still needed to complete the boot status mechanism in ASPEED
WDT driver.
> >
> > On ASPEED platform, the boot status is recorded in the SCU registers.
> > - AST2400: Only a bit represents for any WDT reset.
> > - AST2500: The reset triggered by different WDT controllers can be
> > distinguished by different SCU bits. But, WDIOF_EXTERN1 or
> > WDIOF_CARDRESET still cannot be identified due to
> > HW limitation.
> > - AST2600: Different from AST2500, additional HW bits are added for
> > distinguishing WDIOF_EXTERN1 and WDIOF_CARDRESET.
>
> --
> Patrick Williams
Chin-Ting
Powered by blists - more mailing lists