Message-ID: <4640E751.3040804@myri.com>
Date: Tue, 08 May 2007 23:10:41 +0200
From: Brice Goglin <brice@...i.com>
To: Jeff Garzik <jeff@...zik.org>
CC: netdev@...r.kernel.org
Subject: Re: [PATCH 4/6] myri10ge: limit the number of recoveries
Jeff Garzik wrote:
> Brice Goglin wrote:
>> Limit the number of recoveries from a NIC hw watchdog reset to
>> 1 by default. This is tweakable via the myri10ge_reset_recover
>> tunable.
>
> NAK. Tunables like this are generally (a) never touched by the vast
> majority of users, and (b) have useful values and purposes known only
> to Myri employees :)
Well, actually, it's kind of the opposite. Myri employees won't need to
tune this value since they will be able to replace the NIC with another
one immediately. The whole point of this tunable is to help end-users:
* The default value (set to 1) enables detection of defective NICs
  immediately. These memory parity errors are expected to happen very
  rarely (less than once per century per NIC). However, a defective NIC
  (very rare, fortunately) can see such an error quite often, i.e. every
  few minutes under high load.
* An increased limit value will still allow people with mission-critical
  installations to crank up the tunable and recover an INTMAX number of
  times while waiting for a downtime window to replace the NIC. The
  performance won't be optimal, but at least it will still work.
Should I resend the patch?
Thanks,
Brice