linux-kernel - Re: [PATCH RFC] Watchdog: sbsa


Message-ID: <20160503143856.GE13045@dhcppc6.redhat.com>
Date:	Tue, 3 May 2016 20:08:56 +0530
From:	Pratyush Anand <panand@...hat.com>
To:	Guenter Roeck <linux@...ck-us.net>
Cc:	fu.wei@...aro.org, Suravee.Suthikulpanit@....com,
	timur@...eaurora.org, wim@...ana.be,
	linux-arm-kernel@...ts.infradead.org,
	linux-watchdog@...r.kernel.org,
	open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] Watchdog: sbsa_gwdt: Enhance timeout range

Hi Guenter,

On 03/05/2016:06:29:39 AM, Guenter Roeck wrote:
> On 05/03/2016 01:20 AM, Pratyush Anand wrote:
> >Currently only WOR is used to program both first and second stage which
> >provided very limited range of timeout.
> >
> >This patch uses WCV as well to achieve higher range of timeout. This patch
> >programs max_timeout as 255, but that can be increased further as well.
> >
> >Following testing shows that we can happily achieve 40 second default timeout.
> >
> >  # modprobe sbsa_gwdt action=1
> >  [  131.187562] sbsa-gwdt sbsa-gwdt.0: Initialized with 40s timeout @ 250000000 Hz, action=1.
> >  # cd /sys/class/watchdog/watchdog0/
> >  # cat state
> >  inactive
> >  # cat /dev/watchdog0
> >  cat: /dev/watchdog0: Invalid argument
> >  [  161.710593] watchdog: watchdog0: watchdog did not stop!
> >  # cat state
> >  active
> >  # cat timeout
> >  40
> >  # cat timeleft
> >  38
> >  # cat timeleft
> >  25
> >  # cat /dev/watchdog0
> >  cat: /dev/watchdog0: Invalid argument
> >  [  184.931030] watchdog: watchdog0: watchdog did not stop!
> >  # cat timeleft
> >  37
> >  # cat timeleft
> >  21
> >  ...
> >  ...
> >  # cat timeleft
> >  1
> >
> >panic() is called upon timeout of 40s. See timestamp of last kick (cat) and
> >next panic() message.
> >
> >  [  224.939065] Kernel panic - not syncing: SBSA Watchdog timeout
> >
> >Signed-off-by: Pratyush Anand <panand@...hat.com>
> 
> You could also use the new infrastructure (specify max_hw_heartbeat_ms instead
> of max_timeout), and not depend on the correct implementation of WCV.

Thanks for pointing out max_hw_heartbeat_ms. I have just gone through it. It
would certainly help, and part of this patch would go away.

In fact, once max_hw_heartbeat_ms is supported, there should be no functional
change for action=0. However, we would still need some changes for action=1.

When action=1, the ISR is called, and it calls panic(). panic() in turn
triggers the dump-saving mechanism, which can boot a secondary kernel. With
only the limited timeout (max_hw_heartbeat_ms) programmed by the first kernel,
we may hit a reset before the secondary kernel can start kicking the watchdog
again or finish saving the dump.
So, in my opinion:
(1) We should use max_hw_heartbeat_ms.
(2) We should then overwrite WCV in the ISR, so that the hardware reset happens
only after the user-programmed "timeout" value has elapsed.
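
To make the numbers behind (1) and (2) concrete, here is a rough sketch of the
arithmetic only, not driver code. Both helper names are hypothetical, and it
assumes WOR is a 32-bit offset and WCV a 64-bit compare value, both counted at
the clock frequency the driver reports at init.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: longest interval, in milliseconds, that a 32-bit
 * WOR can express when the counter runs at clk_hz (assumption: WOR is
 * 32 bits wide).
 */
static uint64_t wor_max_heartbeat_ms(uint32_t clk_hz)
{
	assert(clk_hz != 0);
	return (uint64_t)UINT32_MAX * 1000u / clk_hz;
}

/*
 * Hypothetical helper: compare value the ISR could write to WCV so that
 * the reset fires timeout_s seconds after the counter reading `now`
 * (assumption: WCV is compared against a 64-bit counter at clk_hz).
 */
static uint64_t wcv_for_timeout(uint64_t now, uint32_t clk_hz,
				uint32_t timeout_s)
{
	return now + (uint64_t)timeout_s * clk_hz;
}
```

With the 250000000 Hz clock from the log above, wor_max_heartbeat_ms() gives
17179 ms, so a 40s timeout does not fit in WOR alone. wcv_for_timeout() for 40s
adds 10000000000 ticks to the current counter value.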

~Pratyush