Message-ID: <20080917032708.GA8431@havoc.gtf.org>
Date: Tue, 16 Sep 2008 23:27:08 -0400
From: Jeff Garzik <jeff@...zik.org>
To: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
torvalds@...ux-foundation.org, davem@...emloft.net,
arjan@...ux.intel.com
Subject: Re: warn: Turn the netdev timeout WARN_ON() into a WARN()
On Wed, Sep 17, 2008 at 02:59:12AM +0000, Linux Kernel Mailing List wrote:
>
> this patch turns the netdev timeout WARN_ON_ONCE() into a WARN_ONCE(),
> so that the device and driver names are inside the warning message.
> This helps automated tools like kerneloops.org to collect the data
> and do statistics, as well as making it more likely that humans
> cut-n-paste the important message as part of a bugreport.
>
> Signed-off-by: Arjan van de Ven <arjan@...ux.intel.com>
> Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
>
> +#define WARN_ONCE(condition, format...) ({ \
> + static int __warned; \
> + int __ret_warn_once = !!(condition); \
> + \
> + if (unlikely(__ret_warn_once)) \
> + if (WARN(!__warned, format)) \
> + __warned = 1; \
> + unlikely(__ret_warn_once); \
> +})
> +
> #define WARN_ON_RATELIMIT(condition, state) \
> WARN_ON((condition) && __ratelimit(state))
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 9634091..ec0a083 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -215,10 +215,9 @@ static void dev_watchdog(unsigned long arg)
> time_after(jiffies, (dev->trans_start +
> dev->watchdog_timeo))) {
> char drivername[64];
> - printk(KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit timed out\n",
> + WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit timed out\n",
> dev->name, netdev_drivername(dev, drivername, 64));
> dev->tx_timeout(dev);
> - WARN_ON_ONCE(1);

hrm, am I misunderstanding?

AFAICS, this change means the user is no longer notified [after
the first time] of a condition they really need to know about --
a hardware or driver bug.

These conditions can occur many hours or days apart, and the admin
needs to know EACH time one occurs, because it is a major networking
event, generally leading to a complete reset of the entire hardware.

And quite honestly, the backtrace is not useful (yes, even the one
that existed previously)... THINK for a second. The backtrace
is going to look exactly the same, since it is a timer-triggered
dev_watchdog() call.

NETDEV WATCHDOG timeouts are not easily fixable errors like lockdep
warnings, and the admin really does need to see each one.

Unless I am missing something, (1) this patch should be reverted,
and in addition, (2) I recommend removing the WARN_ON_ONCE()
because the backtrace is not helpful.
Jeff