Message-ID: <4A944AAB.6050504@free.fr>
Date: Tue, 25 Aug 2009 22:33:47 +0200
From: Nicolas de Pesloüan
<nicolas.2p.debian@...e.fr>
To: Jay Vosburgh <fubar@...ibm.com>,
Stephen Hemminger <shemminger@...tta.com>
CC: netdev@...r.kernel.org, bonding-devel@...ts.sourceforge.net,
davem@...emloft.net, Jiri Pirko <jpirko@...hat.com>
Subject: Re: [Bonding-devel] [PATCH net-next-2.6] bonding: introduce primary_lazy
option
Jay Vosburgh wrote:
> Nicolas de Pesloüan <nicolas.2p.debian@...e.fr> wrote:
>> Thinking about all that, I am starting to feel that some sort of user space system to
>> select the "best" slave would be better. If we can design a NETLINK interface to
>> report events (slave up, slave down...) to user space, then any user space
>> daemon would be able to tell bonding what to do. Only if no process registers to
>> receive those events would bonding use the normal slave selection rules.
>
> This has been discussed more than once in the past, but hasn't
> ever really gotten anywhere. I suspect the main impediment is the lack
> of a suitable API.
Is there a 'NETLINK for bonding' document describing the proposed API?
I imagine two different parts in the API:
1/ Everything related to configuration (set and read). This should not be far
from the current sysfs API.
2/ Event notification about "everything" that happens inside bonding, to be able
to notify user space in real time.
It might also be interesting to use the netlink API to notify whoever is
interested that a given non-enslaved interface just received an 802.3ad-related
packet. This would allow interfaces to be enslaved automatically into the same
bond when they happen to be connected to the same 802.3ad-capable switch.
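
To make part 2/ a bit more concrete, here is a minimal sketch of what a user
space listener could look like. It is an assumption-laden stand-in: no
bonding-specific netlink family exists in this discussion, so the sketch uses
the generic rtnetlink link notifications (RTM_NEWLINK / RTM_DELLINK) instead.
The names parse_link_events and watch_links are hypothetical, and a real
bonding API would presumably carry bonding-specific attributes that are not
shown here.

```python
import struct

# Constants from <linux/rtnetlink.h>, <linux/if_link.h> and <linux/if.h>.
RTM_NEWLINK = 16
RTM_DELLINK = 17
IFLA_IFNAME = 3
IFF_UP = 0x1
IFF_RUNNING = 0x40

# Netlink uses host byte order, hence "=" (native order, standard sizes).
NLMSG_HDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # family, pad, type, index, flags, change
RTATTR = struct.Struct("=HH")         # len, type


def align4(n):
    """Round n up to the 4-byte alignment used by netlink."""
    return (n + 3) & ~3


def parse_link_events(data):
    """Return one dict per RTM_NEWLINK/RTM_DELLINK message found in data."""
    events = []
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        msg_len, msg_type, _f, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if msg_len < NLMSG_HDR.size or offset + msg_len > len(data):
            break  # truncated or malformed message: stop parsing
        end = offset + msg_len
        body = offset + NLMSG_HDR.size
        is_link = msg_type in (RTM_NEWLINK, RTM_DELLINK)
        if is_link and body + IFINFOMSG.size <= end:
            _fam, _type, index, flags, _chg = IFINFOMSG.unpack_from(data, body)
            name = None
            attr = body + IFINFOMSG.size
            # Walk the attribute list looking for the interface name.
            while attr + RTATTR.size <= end:
                rta_len, rta_type = RTATTR.unpack_from(data, attr)
                if rta_len < RTATTR.size or attr + rta_len > end:
                    break
                if rta_type == IFLA_IFNAME:
                    raw = data[attr + RTATTR.size:attr + rta_len]
                    name = raw.rstrip(b"\0").decode()
                attr += align4(rta_len)
            events.append({
                "ifname": name,
                "ifindex": index,
                "up": bool(flags & IFF_UP),
                "running": bool(flags & IFF_RUNNING),
                "removed": msg_type == RTM_DELLINK,
            })
        offset += align4(msg_len)
    return events


def watch_links():
    """Print link events forever (Linux only; hypothetical daemon skeleton)."""
    import socket
    RTMGRP_LINK = 1  # multicast group for link notifications
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                         socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))
    while True:
        for event in parse_link_events(sock.recv(65535)):
            print(event)
```

A daemon built this way would call watch_links() and, instead of printing,
apply its own policy and then tell bonding which slave to use through whatever
configuration interface is available.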
>> Designing such a NETLINK interface would replace my proposed weight option (at
>> least for best slave selection in active-backup mode and for best aggregator
>> selection in 802.3ad mode). It would also solve the problem reported by Jirka
>> and so replace the proposed primary_lazy option.
>
> Yes, a lot of the decision making at failover could be moved
> into a user space daemon. The daemon, I think, should be optional; if
> the basic selection policies are sufficient, then there's no need for a
> trip to user space and back.
>
>> Anyway, NETLINK is supposed to come to bonding at some
>> point, because we know that the sysfs purists hate the sysfs bonding stuff and
>> that NETLINK is the intended way to set up networking.
>
> I'm not a big fan of the sysfs API, either; it seemed like a
> good idea at the time. It's certainly better than ifenslave in terms of
> features, but some of it is pretty convoluted, and there are things that
> just can't be done from within sysfs.
>
> I recall seeing a note from Stephen Hemminger not too long ago
> (a month or two ago) that he was working on a netlink API for bonding,
> but I don't know how far that ever got.
Yes, I also read this note, and I remember that he found many things would need
to be changed first... :-(
> One question is, if a netlink API is implemented, whether to
> convert ifenslave, or deprecate ifenslave and put the various bonding
> functions into ip.
I suggest enhancing ip and removing ifenslave (or converting it to a script that
would call ip internally, for backward compatibility). Why would we need a
dedicated tool for bonding? We can even write this script now and have it use
sysfs instead of the ioctl, while waiting for the netlink API.
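
As a rough illustration of that interim script, here is a minimal sketch that
maps the three common ifenslave invocations onto writes to the bonding sysfs
files (slaves and active_slave under /sys/class/net/<bond>/bonding/). The
function names bonding_writes and apply_writes are hypothetical, option
parsing is left out, and the sketch assumes the bond already exists and that
the caller has brought interfaces up or down as the kernel requires.

```python
import os

SYSFS_NET = "/sys/class/net"


def bonding_writes(action, bond, ifaces, root=SYSFS_NET):
    """Translate an ifenslave-style request into (path, value) sysfs writes."""
    bond_dir = os.path.join(root, bond, "bonding")
    slaves = os.path.join(bond_dir, "slaves")
    if action == "enslave":          # ifenslave bond0 eth0 eth1
        return [(slaves, "+" + iface) for iface in ifaces]
    if action == "release":          # ifenslave -d bond0 eth0
        return [(slaves, "-" + iface) for iface in ifaces]
    if action == "change-active":    # ifenslave -c bond0 eth1
        if len(ifaces) != 1:
            raise ValueError("change-active takes exactly one interface")
        return [(os.path.join(bond_dir, "active_slave"), ifaces[0])]
    raise ValueError("unknown action: %s" % action)


def apply_writes(writes):
    """Perform the writes; each one is a separate command to the kernel."""
    for path, value in writes:
        with open(path, "w") as f:
            f.write(value)
```

For example, apply_writes(bonding_writes("enslave", "bond0", ["eth0", "eth1"]))
would be the rough equivalent of "ifenslave bond0 eth0 eth1". Keeping the
translation separate from the writes means the back end could later be swapped
for ip or netlink without touching the command-line handling.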
> If a netlink API is on the relatively near horizon (say, within
> a few months), then I'm less inclined to put in the "lazy" option, since
> it would just become baggage carried forward for the next several years
> (until the sysfs API could be deprecated and removed).
Does that mean you suggest that Jiri work with Stephen on the netlink API? :-)
Nicolas.