lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Thu, 30 Apr 2009 10:44:59 +0200
From:	Jesper Dangaard Brouer <jdb@...x.dk>
To:	Ben Hutchings <bhutchings@...arflare.com>
Cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH] sfc: Make temperature warnings/alarms more explicit.

On Thu, 2009-04-30 at 02:25 +0100, Ben Hutchings wrote:
> On Tue, 2009-04-28 at 16:48 +0200, Jesper Dangaard Brouer wrote:
> > The sfc driver can detect different hardware failures via the
> > LM87 system.  One of the failures I have experienced is the
> > temperature alarm, but the error message didn't reveal that this
> > error was temperature related.  I had to read the code to
> > discover that.
> > 
> > I think that the temperature error should be more explicit, in
> > order to warn people before the board is permanently damaged.
> 
> You are right, but...
> 
> > diff --git a/drivers/net/sfc/boards.c b/drivers/net/sfc/boards.c
> > index 4a4c74c..b1822fe 100644
> > --- a/drivers/net/sfc/boards.c
> > +++ b/drivers/net/sfc/boards.c
> > @@ -121,8 +121,10 @@ static int efx_check_lm87(struct efx_nic *efx, unsigned mask)
> >  	if (alarms1 || alarms2) {
> >  		EFX_ERR(efx,
> >  			"LM87 detected a hardware failure (status %02x:%02x)"
> > -			"%s%s\n",
> > +			"%s%s%s\n",
> >  			alarms1, alarms2,
> > +			(alarms1 & (LM87_ALARM_TEMP_INT|LM87_ALARM_TEMP_EXT1))
> > +			 ? " high temperature" : "",
> >  			(alarms1 & LM87_ALARM_TEMP_INT) ? " INTERNAL" : "",
> >  			(alarms1 & LM87_ALARM_TEMP_EXT1) ? " EXTERNAL" : "");
> >  		return -ERANGE;
> 
> We could be more explicit still.  How about:

Yes, the "INTERNAL" and "EXTERNAL" is not very descriptive.

> 
> 		EFX_ERR(efx,
> 			"%s out of range (LM87 status %02x:%02x)\n",
> 			(alarms1 & LM87_ALARM_TEMP_INT) ? "Board temperature" :
> 			(alarms1 & LM87_ALARM_TEMP_EXT1) ? "Controller temperature :
> 			"Voltage",

Notice that both LM87_ALARM_TEMP_INT and LM87_ALARM_TEMP_EXT1 can be set
at the same time.  In that case we only print "Board temperature", is
that good enough?  (I thinks its okay, as the status will provide
developers enough info for debugging the event)

> 			alarms1, alarms2);

I would like to emphesize that its a hardware alarm, by adding "HW
alarm:" see revised patch...

-- 
Med venlig hilsen / Best regards
  Jesper Brouer
  ComX Networks A/S
  Linux Network developer
  Cand. Scient Datalog / MSc.
  Author of http://adsl-optimizer.dk
  LinkedIn: http://www.linkedin.com/in/brouer


[PATCH] sfc: Make temperature warnings/alarms more explicit.

The sfc driver can detect different hardware failures via the
LM87 system.  One of the failures I have experienced is the
temperature alarm, but the error message didn't reveal that this
error was temperature related.  I had to read the code to
discover that.

I think that the temperature error should be more explicit, in
order to warn people before the board is permanently damaged.

Signed-off-by: Jesper Dangaard Brouer <hawk@...x.dk>
---

 drivers/net/sfc/boards.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)


diff --git a/drivers/net/sfc/boards.c b/drivers/net/sfc/boards.c
index 4a4c74c..223866c 100644
--- a/drivers/net/sfc/boards.c
+++ b/drivers/net/sfc/boards.c
@@ -120,11 +120,11 @@ static int efx_check_lm87(struct efx_nic *efx, unsigned mask)
 	alarms2 &= mask >> 8;
 	if (alarms1 || alarms2) {
 		EFX_ERR(efx,
-			"LM87 detected a hardware failure (status %02x:%02x)"
-			"%s%s\n",
-			alarms1, alarms2,
-			(alarms1 & LM87_ALARM_TEMP_INT) ? " INTERNAL" : "",
-			(alarms1 & LM87_ALARM_TEMP_EXT1) ? " EXTERNAL" : "");
+			"HW alarm: %s out of range (LM87 status %02x:%02x)\n",
+			(alarms1 & LM87_ALARM_TEMP_INT) ? "Board temperature" :
+			(alarms1 & LM87_ALARM_TEMP_EXT1) ? "Controller temperature" :
+			 "Voltage",
+			alarms1, alarms2);
 		return -ERANGE;
 	}
 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ