lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080917064128.2b2f8616@linux.intel.com>
Date:	Wed, 17 Sep 2008 06:41:28 -0700
From:	Arjan van de Ven <arjan@...ux.intel.com>
To:	Jeff Garzik <jeff@...zik.org>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	torvalds@...ux-foundation.org, davem@...emloft.net
Subject: Re: warn: Turn the netdev timeout WARN_ON() into a WARN()

On Tue, 16 Sep 2008 23:27:08 -0400
Jeff Garzik <jeff@...zik.org> wrote:

> On Wed, Sep 17, 2008 at 02:59:12AM +0000, Linux Kernel Mailing List
> wrote:
> 
> hrm, am I misunderstanding?
> 
> AFAICS, this change means the user is no longer notified [after
> the first time] of a condition they really need to know about --
> a hardware or driver bug.
> 
> These conditions can occur many hours or days apart, and the admin
> needs to know EACH time it occurs, because it is a major networking
> event, generally leading to a complete reset of the entire hardware.

ok fair enough; the patch below drops the _ONCE

> 
> And quite honestly, the backtrace is not useful (yes, even the one
> that existing previously)...  THINK for a second.  The backtrace
> is going to look exactly the same, since it is a timer-triggered
> dev_watchdog() call.
> 
> NETDEV WATCHDOG timeouts are not easily fixable errors like lockdep
> warnings, and the admin really does need to see each one.
> 
> Unless I am missing something, (1) this patch should be reverted,

.. or rather changed as below ..

> and in additional, (2) I recommend removing the WARN_ON_ONCE()
> because the backtrace is not helpful.

I disagree on your (2):
First of all, you say that these messages are really critical for the
admin; well, a WARN() is a much more critical message than a printk, and
tends to get a lot more attention to it.
Second, a WARN() is about giving a set of standard information so that
the bugreports we get are more useful the first time. Yes a backtrace
is part of that, and yes you're right just the backtrace won't have new
information here, but a WARN() message also contains things such as the
kernel version, the loaded modules and is generally easier for a user
to make a good bugreport from. That one component of it isn't going to
contain new information so be it.
Third, by using this standardized format, computer programs such as
the kerneloops tools can automatically collect these reports and do
statistics on them.

Your argument that these are not easily fixable may be valid, but
they're as you say serious issues, and at least some portion can be
fixed by doing driver work; I just got an email from an ATL1E developer
as a result of my oopsreport who said he was fixing such issues the
last week for example. If all the networking guys agree that the report
has no value for developers because they're all unfixable hardware bugs,
I'm happy to stop tracking it in kerneloops.org, but I get the impression 
so far that that is not the case.




From: Arjan van de Ven <arjan@...ux.intel.com>
Subject: [PATCH] net: WARN_ONCE -> WARN as requested by Jeff Garzik

Jeff points out that you want to see each and every
timeout message

Signed-off-by: Arjan van de Ven <arjan@...ux.intel.com>
---
 net/sched/sch_generic.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index ec0a083..8772642 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -215,7 +215,7 @@ static void dev_watchdog(unsigned long arg)
 			    time_after(jiffies, (dev->trans_start +
 						 dev->watchdog_timeo))) {
 				char drivername[64];
-				WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit timed out\n",
+				WARN(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit timed out\n",
 				       dev->name, netdev_drivername(dev, drivername, 64));
 				dev->tx_timeout(dev);
 			}
-- 
1.5.5.1



-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ