Message-ID: <CAF1J0HMTigLsg3KUMtOUmR0Vyp1nbG3n9pr1JSDLnrDxThAXOA@mail.gmail.com>
Date: Mon, 24 Jun 2013 22:52:09 +0300
From: Mike Rapoport <mike.rapoport@...ellosystems.com>
To: Stephen Hemminger <stephen@...workplumber.org>
Cc: netdev@...r.kernel.org, David Stevens <dlstevens@...ibm.com>,
Thomas Graf <tgraf@...g.ch>
Subject: Re: [PATCH net-next v4 2/2] vxlan: allow specifying multiple default destinations
On Mon, Jun 24, 2013 at 6:35 PM, Stephen Hemminger
<stephen@...workplumber.org> wrote:
> On Mon, 24 Jun 2013 08:57:55 +0300
> Mike Rapoport <mike.rapoport@...ellosystems.com> wrote:
>
>> On Mon, Jun 24, 2013 at 3:14 AM, Stephen Hemminger
>> <stephen@...workplumber.org> wrote:
>> > On Sun, 23 Jun 2013 19:22:23 +0300
>> > Mike Rapoport <mike.rapoport@...ellosystems.com> wrote:
>> >
>> >> A list of multiple default destinations can be used in environments that
>> >> disable multicast on the infrastructure level, e.g. public clouds.
>> >>
>> >> Signed-off-by: Mike Rapoport <mike.rapoport@...ellosystems.com>
>> >> ---
>> >> drivers/net/vxlan.c | 268 +++++++++++++++++++++++++++++++++++++++++--
>> >> include/uapi/linux/if_link.h | 17 +++
>> >> 2 files changed, 276 insertions(+), 9 deletions(-)
>> >>
>> >> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
>> >> index e5fb6568..f57a0d94 100644
>> >> --- a/drivers/net/vxlan.c
>> >> +++ b/drivers/net/vxlan.c
>> >> @@ -103,6 +103,7 @@ struct vxlan_rdst {
>> >> u32 remote_vni;
>> >> u32 remote_ifindex;
>> >> struct list_head list;
>> >> + struct rcu_head rcu;
>> >> };
>> >
>> > The use of remotes_cnt here is not SMP safe.
>> > You are using remotes_cnt to size the buffer for dumping, but then the list
>> > of remotes might change during the dump.
>>
>> The remotes_cnt is used only in netlink callbacks with rtnl_lock held
>> and it cannot be modified otherwise, so I don't see why it is not SMP
>> safe.
>>
>> > There are a couple of alternatives here:
>> > 1. Put a hard limit on the number of remotes per MAC.
>> > 2. When there are multiple destinations, just dump multiple entries, like
>> > multipath routing does.
>> >
>> > I prefer #2 because it also allows for a cleaner API on creation.
>> >
>>
>
> After a few more hours of review, I think the API still needs more work.
> The API uses attributes IFLA_VXLAN_REMOTE_NEW and IFLA_VXLAN_REMOTE_DEL to
> implement adding and deleting entries. This is contrary to other uses of attributes
> in Linux netlink. The convention is that attributes are descriptors of objects,
> not verbs. The attributes are reported and used on creation.
>
> The API needs to use the netlink message flags to indicate create, replace and delete
> instead. It may mean changes to net/core/rtnetlink.c. I would rather see VXLAN follow
> convention as close as possible.
Just to make sure I've got your point here, the API should use
RTM_NEWSOMETHING, RTM_DELSOMETHING and RTM_GETSOMETHING message types
with attribute SOME_PREFIX_VXLAN_REMOTE, and the attribute itself may
contain sub-attributes, such as remote address, port, vni etc...
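As a rough illustration of that shape, the userspace sketch below packs one
default destination as a container attribute holding address, port and VNI
sub-attributes, and picks the verb from the message flags rather than from
the attribute type. The HYP_* attribute numbers and the helper names are
made up for this example and do not exist in any kernel header; only
struct rtattr, the RTA_* macros and the NLM_F_* flags come from the existing
uapi headers. The buffer passed in is assumed to be 4-byte aligned and zeroed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical attribute numbers, for illustration only. */
enum {
	HYP_VXLAN_REMOTE = 100,	/* container: one default destination */
	HYP_REMOTE_ADDR  = 1,	/* IPv4 address, network byte order */
	HYP_REMOTE_PORT  = 2,	/* UDP port, network byte order */
	HYP_REMOTE_VNI   = 3,	/* VXLAN network identifier */
};

/* Append one attribute at *off; returns it, or NULL if it does not fit. */
static struct rtattr *put_attr(char *buf, size_t cap, size_t *off,
			       unsigned short type, const void *data,
			       unsigned short len)
{
	struct rtattr *rta;

	if (*off + RTA_SPACE(len) > cap)
		return NULL;
	rta = (struct rtattr *)(buf + *off);
	rta->rta_type = type;
	rta->rta_len = (unsigned short)RTA_LENGTH(len);
	if (len)
		memcpy(RTA_DATA(rta), data, len);
	*off += RTA_SPACE(len);
	return rta;
}

/* Build the container and its children; returns bytes used or -1. */
static int build_remote(char *buf, size_t cap, uint32_t addr_be,
			uint16_t port_be, uint32_t vni)
{
	size_t off = 0;
	struct rtattr *nest;

	nest = put_attr(buf, cap, &off, HYP_VXLAN_REMOTE, NULL, 0);
	if (!nest ||
	    !put_attr(buf, cap, &off, HYP_REMOTE_ADDR, &addr_be, sizeof(addr_be)) ||
	    !put_attr(buf, cap, &off, HYP_REMOTE_PORT, &port_be, sizeof(port_be)) ||
	    !put_attr(buf, cap, &off, HYP_REMOTE_VNI, &vni, sizeof(vni)))
		return -1;
	nest->rta_len = (unsigned short)off;	/* container spans its children */
	return (int)off;
}

/* The verb lives in the message flags, not in the attribute type. */
static unsigned short flags_for_add(int replace)
{
	return NLM_F_REQUEST | NLM_F_CREATE |
	       (replace ? NLM_F_REPLACE : NLM_F_EXCL);
}
```

With this layout the same attribute describes a destination whether it is
being created, replaced, deleted or dumped; only the message type and flags
change.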
If this assumption is correct I could think of the following alternatives:
1) Add RTM_NEWVXLANDST, which seems to me somewhat overkill
2) Add RTA_VXLAN_REMOTE to rtattr_type_t. This way the creation API
will be similar to multipath routing, but I'm not sure that adding
VXLAN specific attribute type to rtattr_type_t is appropriate.
3) Allow a zero MAC address in rtnl_fdb_{add,del} and then make the
default destinations part of the fdb, as David Stevens suggested [1].
In this case fdb deletion should be reworked so that at least one
default destination is always kept.
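The deletion constraint in (3) could look roughly like the sketch below.
It is a plain userspace model, not driver code: struct remote, the singly
linked list, the function name and the choice of -EBUSY/-ENOENT as return
values are all assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an entry in the default destination list. */
struct remote {
	uint32_t ip;
	struct remote *next;
};

/*
 * Unlink the destination matching ip, unless it is the only one left.
 * Returns 0 on success, -EBUSY if it is the last destination and
 * -ENOENT if no entry matches. The caller owns the unlinked memory.
 */
static int default_dst_del(struct remote **head, uint32_t ip)
{
	struct remote **pp;

	for (pp = head; *pp; pp = &(*pp)->next) {
		if ((*pp)->ip != ip)
			continue;
		/* Sole entry: refuse, so there is always a default target. */
		if (*pp == *head && !(*pp)->next)
			return -EBUSY;
		*pp = (*pp)->next;
		return 0;
	}
	return -ENOENT;
}
```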
I personally favor (2) because it allows semantic distinction between
fdb entries and default destinations.
> Sorry for being so difficult but once an API is done, it has a long lifetime and other
> stuff tends to follow it. I know from experience, having made the mistake far
> too often.
I would prefer to receive such feedback earlier, but I definitely
understand your concern :)
--
[1] http://thread.gmane.org/gmane.linux.network/270969/focus=271791
--
Sincerely yours,
Mike.