Message-ID: <66dece5d-d2e2-c3e0-3d7a-565385fe5003@roeck-us.net>
Date:   Fri, 8 Oct 2021 06:05:12 -0700
From:   Guenter Roeck <linux@...ck-us.net>
To:     Jan Kiszka <jan.kiszka@...mens.com>,
        Wim Van Sebroeck <wim@...ux-watchdog.org>
Cc:     linux-watchdog@...r.kernel.org, linux-kernel@...r.kernel.org,
        Mantas Mikulėnas <grawity@...il.com>,
        "Javier S . Pedro" <debbugs@...ispedro.com>
Subject: Re: [PATCH] Revert "watchdog: iTCO_wdt: Account for rebooting on
 second timeout"

On 10/8/21 12:52 AM, Jan Kiszka wrote:
> On 08.10.21 02:33, Guenter Roeck wrote:
>> This reverts commit cb011044e34c ("watchdog: iTCO_wdt: Account for
>> rebooting on second timeout") and commit aec42642d91f ("watchdog: iTCO_wdt:
>> Fix detection of SMI-off case") since those patches cause a regression
>> on certain boards (https://bugzilla.kernel.org/show_bug.cgi?id=213809).
>>
>> While this revert may result in some boards only resetting after twice
>> the configured timeout value, that is still better than a watchdog reset
>> after half the configured value.
>>
>> Fixes: cb011044e34c ("watchdog: iTCO_wdt: Account for rebooting on second timeout")
>> Fixes: aec42642d91f ("watchdog: iTCO_wdt: Fix detection of SMI-off case")
>> Cc: Jan Kiszka <jan.kiszka@...mens.com>
>> Cc: Mantas Mikulėnas <grawity@...il.com>
>> Reported-by: Javier S. Pedro <debbugs@...ispedro.com>
>> Signed-off-by: Guenter Roeck <linux@...ck-us.net>
>> ---
>>   drivers/watchdog/iTCO_wdt.c | 12 +++---------
>>   1 file changed, 3 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/watchdog/iTCO_wdt.c b/drivers/watchdog/iTCO_wdt.c
>> index 643c6c2d0b72..ced2fc0deb8c 100644
>> --- a/drivers/watchdog/iTCO_wdt.c
>> +++ b/drivers/watchdog/iTCO_wdt.c
>> @@ -71,8 +71,6 @@
>>   #define TCOBASE(p)	((p)->tco_res->start)
>>   /* SMI Control and Enable Register */
>>   #define SMI_EN(p)	((p)->smi_res->start)
>> -#define TCO_EN		(1 << 13)
>> -#define GBL_SMI_EN	(1 << 0)
>>   
>>   #define TCO_RLD(p)	(TCOBASE(p) + 0x00) /* TCO Timer Reload/Curr. Value */
>>   #define TCOv1_TMR(p)	(TCOBASE(p) + 0x01) /* TCOv1 Timer Initial Value*/
>> @@ -357,12 +355,8 @@ static int iTCO_wdt_set_timeout(struct watchdog_device *wd_dev, unsigned int t)
>>   
>>   	tmrval = seconds_to_ticks(p, t);
>>   
>> -	/*
>> -	 * If TCO SMIs are off, the timer counts down twice before rebooting.
>> -	 * Otherwise, the BIOS generally reboots when the SMI triggers.
>> -	 */
>> -	if (p->smi_res &&
>> -	    (inl(SMI_EN(p)) & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN))
>> +	/* For TCO v1 the timer counts down twice before rebooting */
>> +	if (p->iTCO_version == 1)
>>   		tmrval /= 2;
>>   
>>   	/* from the specs: */
>> @@ -527,7 +521,7 @@ static int iTCO_wdt_probe(struct platform_device *pdev)
>>   		 * Disables TCO logic generating an SMI#
>>   		 */
>>   		val32 = inl(SMI_EN(p));
>> -		val32 &= ~TCO_EN;	/* Turn off SMI clearing watchdog */
>> +		val32 &= 0xffffdfff;	/* Turn off SMI clearing watchdog */
>>   		outl(val32, SMI_EN(p));
>>   	}
>>   
>>
> 
> Sigh, how broken is this architecture of the iTCO? Agreed, this leaves
> no option then.
> 
> BTW, the fact that we saw an inconsistency in read-back timeout
> indicates that there is still an issue for the remaining /= 2 case
> (that is, v1), but I'm losing interest in fixing those issues, given how
> hard it is to test broadly without breaking users first.
> 

Agreed. This is because the /= 2 handling is only implemented in
iTCO_wdt_set_timeout() without matching code in iTCO_wdt_get_timeleft().
I don't have hardware to test, so I am not going to touch that code
myself. We can address that if/when someone reports the actual problem
and has the ability to test a fix.
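
To illustrate the mismatch described above, here is a minimal sketch. It is
a simplified model, not the driver's code: every model_* name is
hypothetical, the 0.6 s tick length is an assumption made for the example,
and the "matched" variant only shows one way the get path could mirror the
set path. It has not been tested on hardware.

```c
/* Hypothetical model: one timer tick is assumed to last 0.6 s. */
unsigned int model_seconds_to_ticks(unsigned int secs)
{
	return (secs * 10) / 6;
}

unsigned int model_ticks_to_seconds(unsigned int ticks)
{
	return (ticks * 6) / 10;
}

/*
 * Set path: if the timer counts down twice before the reset, program
 * half of the requested value so that both passes add up to the request.
 */
unsigned int model_set_timeout(unsigned int secs, int counts_twice)
{
	unsigned int tmrval = model_seconds_to_ticks(secs);

	if (counts_twice)
		tmrval /= 2;
	return tmrval;
}

/* Get path without matching handling: converts the raw counter only. */
unsigned int model_timeleft_unmatched(unsigned int counter)
{
	return model_ticks_to_seconds(counter);
}

/*
 * Get path with matching handling (assumed fix): while the first pass
 * is still running, the second pass of tmrval ticks is still ahead.
 */
unsigned int model_timeleft_matched(unsigned int counter,
				    unsigned int tmrval,
				    int counts_twice, int first_pass)
{
	if (counts_twice && first_pass)
		counter += tmrval;
	return model_ticks_to_seconds(counter);
}
```

With a 30 s request and a timer that counts down twice, model_set_timeout()
yields 25 ticks. Right after a reload, the unmatched get path reports 15 s,
while the matched one reports 30 s.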

Thanks,
Guenter

