[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c81fd654-011f-0375-7dde-61365c898efb@roeck-us.net>
Date: Thu, 14 Oct 2021 09:56:44 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Sander Vanheule <sander@...nheule.net>
Cc: linux-watchdog@...r.kernel.org, devicetree@...r.kernel.org,
Wim Van Sebroeck <wim@...ux-watchdog.org>,
Rob Herring <robh+dt@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] watchdog: Add Realtek Otto watchdog timer
On 10/14/21 3:26 AM, Sander Vanheule wrote:
> On Wed, 2021-10-13 at 14:03 -0700, Guenter Roeck wrote:
>> On 10/13/21 12:46 PM, Sander Vanheule wrote:
>>> On Wed, 2021-10-13 at 11:48 -0700, Guenter Roeck wrote:
>>>> On Wed, Oct 13, 2021 at 03:29:00PM +0200, Sander Vanheule wrote:
>>> [...]
>>>
>>>>>
>>>>> diff --git a/drivers/watchdog/realtek_otto_wdt.c
>>>>> b/drivers/watchdog/realtek_otto_wdt.c
>>>>> new file mode 100644
>>>>> index 000000000000..64c9cba6b0b1
>>>>> --- /dev/null
>>>>> +++ b/drivers/watchdog/realtek_otto_wdt.c
>>>>> @@ -0,0 +1,411 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>>> +
>>>>> +/*
>>>>> + * Realtek Otto MIPS platform watchdog
>>>>> + *
>>>>> + * Watchdog timer that will reset the system after timeout, using the
>>>>> selected
>>>>> + * reset mode.
>>>>> + *
>>>>> + * Counter scaling and timeouts:
>>>>> + * - Base prescale of (2 << 25), providing tick duration T_0: 168ms @
>>>>> 200MHz
>>>>> + * - PRESCALE: logarithmic prescaler adding a factor of {1, 2, 4, 8}
>>>>> + * - Phase 1: Times out after (PHASE1 + 1) × PRESCALE × T_0
>>>>> + * Generates an interrupt, WDT cannot be stopped after phase 1
>>>>> + * - Phase 2: starts after phase 1, times out after (PHASE2 + 1) ×
>>>>> PRESCALE × T_0
>>>>> + * Resets the system according to RST_MODE
>>>>
>>>> Why is there a phase2 interrupt if phase2 resets the chip ?
>>>>
>>>
>>> The SoC's reset controller has an interrupt line for phase2, even though
>>> then it then the
>>> WDT also resets the system. I don't have any documentation about this
>>> peripheral; just
>>> some vendor code and there the phase2 interrupt isn't enabled. I mainly
>>> added it here for
>>> completeness.
>>>
>>
>> It seems pointless to mandate an interrupt just for completeness.
>
> Okay, then I will just drop it here. As I understand, the bindings should be as
> complete as possible, so I think the phase2 interrupt definition should remain
> there?
>
I still don't see the point of it if there is no known use case. At the very
least it will need to be optional, but even then I would expect a description
of the use case.
FWIW, technically I suspect that there is a means for the watchdog to generate
a second interrupt instead of resetting the hardware (otherwise the second
interrupt would not really make sense), but without hardware and without
datasheet it is impossible to confirm that.
Guenter
>>
>>> One thing to note is that after CPU or software reset (not SoC reset) the
>>> phase2 flag in
>>> OTTO_WDT_REG_INTR will be set. That's why I always clear it in
>>> otto_wdt_probe(), because
>>> otherwise enabling the interrupt line would trigger otto_wdt_phase2_isr().
>>> On warm
>>> restarts this bit could be used to determine if there was a WDT timeout, but
>>> not if the
>>> WDT is configured for cold restarts (i.e. full SoC reset).
>>>
>>>>
>>> [...]
>>>>> +
>>>>> + raw_spin_lock_irqsave(&ctrl->lock, flags);
>>>>> + v = ioread32(ctrl->base + OTTO_WDT_REG_CTRL);
>>>>> + v |= OTTO_WDT_CTRL_ENABLE;
>>>>> + iowrite32(v, ctrl->base + OTTO_WDT_REG_CTRL);
>>>>> + raw_spin_unlock_irqrestore(&ctrl->lock, flags);
>>>>
>>>> Is it really necessary to disable interrupts for those operations ?
>>>
>>> The ISR routines only use REG_INTR, which isn't modified anywhere else
>>> (outside of probing
>>> the device). I will replace these with raw_spin_{lock,unlock} throughout.
>>>
>>
>> In that case you should not need any locks at all since the watchdog core
>> ensures
>> that the device is opened only once (and thus only one entity can enable or
>> disable
>> the watchdog).
>
> If there is an external guarantee that at most one of {set_timeout,
> set_pretimeout, enable, disable} will be called at a time, I can indeed drop the
> lock. I had added the lock initially because of the read-modify-write operations
> on the control register these ops perform.
>
>
>>
>>> [...]
>>>>> +/*
>>>>> + * The timer asserts the PHASE1/PHASE2 IRQs when the number of ticks
>>>>> exceeds
>>>>> + * the value stored in those fields. This means the timer will run for
>>>>> at least
>>>>> + * one tick, so small values need to be clamped to correctly reflect
>>>>> the timeout.
>>>>> + */
>>>>> +static inline unsigned int div_round_ticks(unsigned int val, unsigned
>>>>> int
>>>>> tick_duration,
>>>>> + unsigned int min_ticks)
>>>>> +{
>>>>> + return max(min_ticks, DIV_ROUND_CLOSEST(val, tick_duration));
>>>>
>>>> Are you sure that DIV_ROUND_CLOSEST is appropriate in those calculations
>>>> (instead of DIV_ROUND_UP or DIV_ROUND_DOWN) ?
>>>>
>>> [...]
>>>
>>>>> +
>>>>> + timeout_ms = total_ticks * tick_ms;
>>>>> + ctrl->wdev.timeout = DIV_ROUND_CLOSEST(timeout_ms, 1000);
>>>>> +
>>>>
>>>> That means the actual timeout (and pretimeout) can be slightly larger
>>>> than the real timeout. Is this really what you want ?
>>>
>>> Is it a problem if the WDT times out later than specified by
>>> watchdog_device.(pre)timeout?
>>> I can see that premature timeouts would be an issue, but I don't suppose
>>> it's problematic
>>> if the WDT is pinged (slightly) sooner than actually required?
>>>
>>
>> I am not concerned with early pings. However, if the timeout limit is set to a
>> value
>> lardger than the real timeout (eg the real timeout is 25.6 seconds and the
>> timeout
>> value is set to 26 seconds), the reset may occur a bit early. Granted, it
>> doesn't
>> matter much, but most driver authors would ensure that the timeout is set to
>> 25 seconds
>> (ie rounded down) in that situation.
>
> I'll replace tick rounding with DIV_ROUND_UP, and timeout rounding with regular
> flooring division. This results in a few timeout values being rounded up for the
> coarsest tick duration, but those are then stable values.
>
> Best,
> Sander
>
>>
>>> The coarsest ticks are 1342 ms, so it is not always possible to provide the
>>> requested
>>> (pre)timeout value, independent of the rounding scheme. Although I think it
>>> should be
>>> possible to replace timeout rounding by DIV_ROUND_UP (of total_ticks_ms),
>>> and pretimeout
>>> rounding by DIV_ROUND_DOWN (of phase2_ticks_ms), and keep stable timeouts
>>> when alternating
>>> between set_timeout/set_pretimeout.
>>>
>>>>
>>>>> + pretimeout_ms = phase2_ticks * tick_ms;
>>>>> + ctrl->wdev.pretimeout = DIV_ROUND_CLOSEST(pretimeout_ms, 1000);
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> +static int otto_wdt_set_timeout(struct watchdog_device *wdev, unsigned
>>>>> int val)
>>>>> +{
>>>>> + struct otto_wdt_ctrl *ctrl = watchdog_get_drvdata(wdev);
>>>>> + unsigned long flags;
>>>>> + unsigned int ret;
>>>>> +
>>>>> + if (watchdog_timeout_invalid(wdev, val))
>>>>> + return -EINVAL;
>>>>
>>>> This is not supposed to happen because the calling code already performs
>>>> range checks.
>>>
>>> Right, I will drop the redundant check here and in set_pretimeout.
>>>
>>>>
>>> [...]
>>>>> +static int otto_wdt_restart(struct watchdog_device *wdev, unsigned long
>>>>> reboot_mode,
>>>>> + void *data)
>>>>> +{
>>>>> + struct otto_wdt_ctrl *ctrl = watchdog_get_drvdata(wdev);
>>>>> + u32 reset_mode;
>>>>> + u32 v;
>>>>> +
>>>>> + devm_free_irq(ctrl->dev, ctrl->irq_phase1, ctrl);
>>>>> +
>>>>
>>>> Why is this needed (instead of, say, disabling the interrupt) ?
>>>
>>> Disabling the interrupt should actually be enough. I'll replace the
>>> devm_free_irq() with
>>> disable_irq(). Somehow I didn't find disable_irq(), even though that was
>>> what I was
>>> looking for...
>>>
>>> [...]
>>>>> +
>>>>> + /*
>>>>> + * Since pretimeout cannot be disabled, min_timeout is twice the
>>>>> + * subsystem resolution. max_timeout is 44s at a bus clock of
>>>>> 200MHz.
>>>>> + */
>>>>> + ctrl->wdev.min_timeout = 2;
>>>>> + max_tick_ms = otto_wdt_tick_ms(ctrl, OTTO_WDT_PRESCALE_MAX);
>>>>> + ctrl->wdev.max_timeout =
>>>>> + DIV_ROUND_CLOSEST(max_tick_ms *
>>>>> OTTO_WDT_TIMEOUT_TICKS_MAX, 1000);
>>>>
>>>> Any reason for using max_timeout instead of max_hw_heartbeat_ms ?
>>>
>>> I must have missed this when I was looking at watchdog_device. Makes sense
>>> to use
>>> max_hw_heartbeat_ms since that reflects the actual value more accurately.
>>>
>>>>
>>>
>>> Thanks for the feedback!
>>>
>>> Best,
>>> Sander
>>>
>>
>
Powered by blists - more mailing lists