Message-ID: <1329164644.2697.49.camel@bwh-desktop>
Date: Mon, 13 Feb 2012 20:24:04 +0000
From: Ben Hutchings <bhutchings@...arflare.com>
To: Stephen Hemminger <shemminger@...tta.com>
CC: Chris Friesen <chris.friesen@...band.com>,
Jay Vosburgh <fubar@...ibm.com>, <andy@...yhouse.net>,
netdev <netdev@...r.kernel.org>
Subject: Re: [BUG?] bonding, slave selection, carrier loss, etc.

On Mon, 2012-02-13 at 10:48 -0800, Stephen Hemminger wrote:
> On Mon, 13 Feb 2012 12:16:59 -0600
> Chris Friesen <chris.friesen@...band.com> wrote:
>
> > On 02/11/2012 12:52 PM, Ben Hutchings wrote:
> > > On Fri, 2012-02-10 at 17:53 -0800, Jay Vosburgh wrote:
> > >> Chris Friesen<chris.friesen@...band.com> wrote:
> >
> > >>> The best solution would be for bonding to just register for notification
> > >>> of the link going down. Presumably most drivers should be doing that
> > >>> properly by now, and for devices that get interrupt-driven notification
> > >>> of link status changes this would allow the bonding code to react much
> > >>> quicker.
> > >>
> > >> A quick look at some drivers shows that at least acenic still
> > >> doesn't do netif_carrier_off, so converting entirely to a notifier-based
> > >> failover mechanism would break drivers that work today.
> > > [...]
> > >
> > > It might be worth having some sort of feature flag (in priv_flags) that
> > > indicates whether the driver updates the link state. Alternately,
> > > disable polling of a device once you see a notification.
>
> Just fix the drivers to update link state.
> The whole mii polling method of bonding is really leftover from the era of
> 10 years ago when network drivers were stupid and didn't handle carrier.

Lots of hardware doesn't generate link interrupts. Our SFC4000 was
supposed to generate events for link changes, but this didn't work
reliably and so we poll regularly in the driver. I think the older
drivers fail to update carrier because of similar hardware limitations.

If you want to remove link polling from the bonding driver then it has
to live *somewhere*. Rather than requiring every affected driver to
implement the timer or delayed work item, I would suggest you put that
in the networking core and then require drivers to either provide a link
polling function or specify that they don't require polling. Then
export the obvious implementations using ethtool or MII so that drivers
don't have to replicate those.
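
A minimal userspace sketch of the arrangement suggested above, not
kernel code. Every name here (model_dev, needs_link_poll, poll_link,
model_generic_poll, model_register, model_core_poll) is hypothetical
and exists only for illustration; the hw_link field stands in for
whatever a real driver would read via ethtool or MII. It assumes the
core refuses registration unless the driver either supplies a poll
function or declares that it needs no polling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are real kernel interfaces. */
struct model_dev;

/* A link polling function returns true when the link is up. */
typedef bool (*link_poll_fn)(struct model_dev *dev);

struct model_dev {
    const char *name;
    bool needs_link_poll;   /* false: driver updates carrier by itself */
    link_poll_fn poll_link; /* must be set when needs_link_poll is true */
    bool carrier;           /* link state as seen by the core */
    bool hw_link;           /* stand-in for a PHY/MAC status register */
};

/* Shared helper the core could export so drivers need not write their
 * own; a placeholder for an ethtool- or MII-based implementation. */
bool model_generic_poll(struct model_dev *dev)
{
    return dev->hw_link;
}

/* Registration check: a driver must either provide a poll function or
 * state that it does not require polling. Returns 0 on success. */
int model_register(struct model_dev *dev)
{
    assert(dev != NULL);
    if (dev->needs_link_poll && dev->poll_link == NULL)
        return -1;
    return 0;
}

/* One tick of the core's polling timer. Returns the number of devices
 * whose carrier state changed on this tick. */
size_t model_core_poll(struct model_dev *devs, size_t n)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        struct model_dev *dev = &devs[i];

        /* Skip devices that report link changes on their own. */
        if (!dev->needs_link_poll || dev->poll_link == NULL)
            continue;

        bool up = dev->poll_link(dev);
        if (up != dev->carrier) {
            dev->carrier = up;
            changed++;
        }
    }
    return changed;
}
```

In this model the bonding driver would only ever look at carrier; the
decision about who polls, and how, stays between the core and the
individual driver.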

Ben.

--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.