[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <499cfd09-0c16-af51-ad17-55d6425ded8e@roeck-us.net>
Date: Mon, 13 Feb 2023 07:26:29 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: Xingyu Wu <xingyu.wu@...rfivetech.com>
Cc: linux-riscv@...ts.infradead.org, devicetree@...r.kernel.org,
linux-watchdog@...r.kernel.org,
Wim Van Sebroeck <wim@...ux-watchdog.org>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
Rob Herring <robh+dt@...nel.org>,
Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
Albert Ou <aou@...s.berkeley.edu>,
Philipp Zabel <p.zabel@...gutronix.de>,
Samin Guo <samin.guo@...rfivetech.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/3] drivers: watchdog: Add StarFive Watchdog driver
On 2/13/23 05:29, Xingyu Wu wrote:
> On 2023/2/10 23:28, Guenter Roeck wrote:
>> On 2/9/23 23:01, Xingyu Wu wrote:
>>> On 2023/2/2 6:46, Guenter Roeck wrote:
>>>> On Mon, Dec 19, 2022 at 05:42:32PM +0800, Xingyu Wu wrote:
>>>>> Add watchdog driver for the StarFive JH7110 SoC.
>>>>>
>>>>> Signed-off-by: Xingyu Wu <xingyu.wu@...rfivetech.com>
>>>>
>>
>> [...]
>>
>>>>
>>>>> +MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
>>>>> + __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
>>>>> +MODULE_PARM_DESC(soft_noboot,
>>>>> + "Watchdog action, set to 1 to ignore reboots, 0 to reboot (default 0)");
>>>>
>>>> I do not understand what this module parameter is supposed to be used for,
>>>> and what the "soft_' prefix is supposed to mean.
>>>
>>> This 'soft_noboot' means watchdog reset enabled status. If 'soft_noboot' is set to 1,
>>> it means reset is disabled and do not reboot.Or 'reboot_disbled' instead?
>>>
>>
>> That means it does nothing ? Why load the watchdog in the first place then ?
>>
>
> This feature is used for debugging, so user can check the counter repeatedly
> without rebooting the system.
>
Debug options are not acceptable as module options. Please make it a define
if you think you need it.
>> [...]
>>>>> +
>>>>> +/* interrupt status whether has been raised from the counter */
>>>>> +static bool starfive_wdt_raise_irq_status(struct starfive_wdt *wdt)
>>>>> +{
>>>>> + return !!readl(wdt->base + wdt->drv_data->irq_is_raise);
>>>>> +}
>>>>> +
>>>>> +static void starfive_wdt_int_clr(struct starfive_wdt *wdt)
>>>>> +{
>>>>> + writel(STARFIVE_WDT_INTCLR, wdt->base + wdt->drv_data->int_clr);
>>>>> +}
>>>>
>>>> There is no explanation for this interrupt handling (or, rather,
>>>> non-handling since there is no interrupt handler. What is the purpose
>>>> of even having all this code ?
>>>
>>> Because the watchdog raise an interrupt signal on the hardware when timeout,
>>> although we do not use interrupt handler on the sorfware, but the watchdog
>>> initialization or reload also need to clear the hardware interrupt signal.
>>>
>>
>> That should be documented.
>>
>
> I will add that in the comments.
>
>>
>> [...]
>>>>> +
>>>>> + /*
>>>>> + * This watchdog takes twice timeouts to reset.
>>>>> + * In order to reduce time to reset, should set half count value.
>>>>> + */
>>>>> + count = timeout * freq / 2;
>>>>> +
>>>>> + if (count > STARFIVE_WDT_MAXCNT) {
>>>>
>>>> count is an unsigned int, which is pretty much everywhere a 32-bit
>>>> value. STARFIVE_WDT_MAXCNT is 0xffffffff. How is an unsigned integer
>>>> ever supposed to be larger than 0xffffffff ?
>>>>
>>>>> + dev_warn(wdt->dev, "timeout %d too big,use the MAX-timeout set.\n",
>>>>> + timeout);
>>>>> + timeout = starfive_wdt_max_timeout(wdt);
>>>>> + count = timeout * freq;
>>>>
>>>> This is confusing. First, the timeout range is checked by the calling code,
>>>> which makes sure it is never larger than max_timeout. max_timeout is
>>>> set to the value returned by starfive_wdt_max_timeout().
>>>> Thus, count = timeout * freq / 2 will _never_ be larger than
>>>> STARFIVE_WDT_MAXCNT. Even if it ever was, it doesn't make sense
>>>> to use a count value of timeout * freq in that case, ie a value which
>>>> could be twice as large as the supposed maximum value.
>>>
>>> Change 'count' type to 'u64'. And if 'count' is larger than STARFIVE_WDT_MAXCNT,
>>> 'count' is equal to STARFIVE_WDT_MAXCNT. Does that seem OK?
>>>
>>
>> That would not change anything. This is not about overflows; I would
>> have mentioned that. count can still never be larger than STARFIVE_WDT_MAXCNT.
>> Please do the math.
>>
>
> I get your point. It has checked the 'count' before the ops of 'set_timeout' so
> I check the 'count' uselessly in this. I will remove this.
>
>>
>> [...]
>>>>> +
>>>>> + if (tmr_atboot && started == 0) {
>>>>> + starfive_wdt_start(&wdt->wdt_device);
>>>>> + } else if (!tmr_atboot) {
>>>>> + /*
>>>>> + *if we're not enabling the watchdog, then ensure it is
>>>>> + * disabled if it has been left running from the bootloader
>>>>> + * or other source.
>>>>> + */
>>>>> + starfive_wdt_stop(&wdt->wdt_device);
>>>>
>>>> If it _is_ running from the boot loader, the watchdog core is not
>>>> informed about it. If it is started here, it is not informed either.
>>>> This is unusual and will need to be explained.
>>>> Why ?
>>>
>>> Is is okay to remove the 'started'?
>>>
>> Yes, though of course it would be better if the watchdog is kept running
>> in that situation and the watchdog core is informed about it. Also,
>> the watchdog core is still not informed that the watchdog is running
>> (i.e., WDOG_HW_RUNNING is not set) when it is explicitly started.
>>
>
> Will remove the 'started'.
>
>>>>
>>>>> + }
>>>>> +
>>>>> +#ifdef CONFIG_PM
>>>>> + clk_disable_unprepare(wdt->core_clk);
>>>>> + clk_disable_unprepare(wdt->apb_clk);
>>>>> +#endif
>>>>
>>>> I do not understand the above code. Why stop the watchdog if CONFIG_PM
>>>> is enabled, even if it is supposed to be running ?
>>>
>>> There misses a check about 'early_enable' and add 'if (!early_enable)'.
>>> There means that disable clock when watchdog sleep and CONFIG_PM is enable.
>>> Then enable clock when watchdog work by 'starfive_wdt_runtime_resume' function.
>>>
>>
>> I am quite sure that you are supposed to use pm functions for that purpose,
>> such as pm_runtime_get_sync(), pm_runtime_put_sync(), and pm_runtime_enable(),
>> similar to the code in omap_wdt.c.
>>
>
> I will use pm_runtime_get_sync() and pm_runtime_put_sync() to operate clocks.
>
>> [...]
>>>>> +#ifdef CONFIG_PM_SLEEP
>>>>> +static int starfive_wdt_suspend(struct device *dev)
>>>>> +{
>>>>> + int ret;
>>>>> + struct starfive_wdt *wdt = dev_get_drvdata(dev);
>>>>> +
>>>>> + starfive_wdt_unlock(wdt);
>>>>> +
>>>>> + /* Save watchdog state, and turn it off. */
>>>>> + wdt->reload = starfive_wdt_get_count(wdt);
>>>>> +
>>>>> + starfive_wdt_mask_and_disable_reset(wdt, true);
>>>>> +
>>>>> + /* Note that WTCNT doesn't need to be saved. */
>>>>> + starfive_wdt_stop(&wdt->wdt_device);
>>>>> + pm_runtime_force_suspend(dev);
>>>>> +
>>>>> + starfive_wdt_lock(wdt);
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> +static int starfive_wdt_resume(struct device *dev)
>>>>> +{
>>>>> + int ret;
>>>>> + struct starfive_wdt *wdt = dev_get_drvdata(dev);
>>>>> +
>>>>> + starfive_wdt_unlock(wdt);
>>>>> +
>>>>> + pm_runtime_force_resume(dev);
>>>>> +
>>>>> + /* Restore watchdog state. */
>>>>> + starfive_wdt_set_reload_count(wdt, wdt->reload);
>>>>> +
>>>>> + starfive_wdt_restart(&wdt->wdt_device, 0, NULL);
>>>>
>>>> I do not understand this call. Per its definition it is a restart handler,
>>>> supposed to restart the hardware. Why would anyone want to restart the
>>>> system on resume ?
>>>
>>> The watchdog start counting from 'count' to 0 on everytime resume like a restart.
>>> So I directly use a restart.
>>>
>>
>> That doesn't answer my question. The "restart" callback resets the hardware.
>> starfive_wdt_restart() is registered as restart handler, and thus expected
>> to reset the hardware. It it doesn't reset the hardware, it should not
>> register itself as restart handler. If it does restart the hardware, calling
>> it on resume would automatically reset the system on each resume.
>> Something is wrong here, and will have to be fixed.
>>
>> I _suspect_ that you think that the restart callback is supposed to reset
>> the watchdog. That would be wrong. It resets (restarts) the hardware,
>> not the watchdog. Please read Documentation/watchdog/watchdog-kernel-api.rst
>> if there are questions about this callback.
>>
>
> Thanks you for reminding me. I finally understand that the restart callback is supposed
> to reset the hardware not watchdog. This watchdog doesn't reset the hardware, and I will
> remove that. By the way, if I don't need restart callback, would I still use the
> 'watchdog_set_restart_priority' function? Thanks.
>
No.
Thanks,
Guenter
> Best regards,
> Xingyu Wu
Powered by blists - more mailing lists