Message-ID: <20150413093446.438bca5e@urahara>
Date: Mon, 13 Apr 2015 09:34:46 -0700
From: Stephen Hemminger <stephen@...workplumber.org>
To: Peter Nørlund <pch@...bogen.com>
Cc: netdev@...r.kernel.org
Subject: Re: ipv4: add hash-based multipath routing
On Sun, 12 Apr 2015 20:54:30 +0200
Peter Nørlund <pch@...bogen.com> wrote:
> Hi all,
>
> I'm working on adding L3/L4 hash-based IPv4 multipath to the kernel,
> but I wonder what the best approach for the mainline kernel is.
>
> When the IPv6 multipath code was added, choosing the routing algorithm
> by means of compile-time config or sysctl was rejected, so I assume
> that we want to revive the RTA_MP_ALGO or a new attribute?
>
> The IPv6 multipath uses L4 balancing - which is fine for IPv6 where
> fragmentation does not happen - but in my opinion the safest default
> for IPv4 is L3, especially when multipath is used together with anycast.
>
> My main problem is the existing multipath code which is really old
> (Linux 2.1.66). From the looks of it, it attempts to be somewhat random,
> but in reality it is more or less weighted round-robin, and as far as I
> can tell it even has an off-by-one error in its handling of the random
> value. I think it is wise to support L3, L4, and per-packet
> load-balancing, just like the hardware vendors, but must the per-packet
> load-balancing be the default, or is it okay to change the default
> behavior? Also, would a weighted round-robin with a single per-cpu
> counter suffice? This would get rid of the spinlock and avoid causing
> cache invalidations of the route info with each packet. But it would not
> be true round-robin, which would require a per-route-info counter. If
> we are promising round-robin it is bad, but if we are simply promising
> weighted per-packet load-balancing, it's a different matter.
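To make the per-cpu counter trade-off concrete, here is a minimal
userspace sketch of weighted per-packet selection driven by one shared
counter. It is not kernel code: struct wrr_nexthop, wrr_select() and the
counter argument (standing in for a per-cpu variable) are hypothetical
names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical next-hop entry for illustration; not a kernel structure. */
struct wrr_nexthop {
    int id;
    unsigned int weight;
};

/*
 * Pick a next hop by weight. 'counter' stands in for a per-cpu packet
 * counter shared by all routes; it is advanced once per call.
 * Returns the index of the chosen next hop.
 */
static size_t wrr_select(const struct wrr_nexthop *nh, size_t n,
                         unsigned int *counter)
{
    unsigned int total = 0;
    unsigned int slot;
    size_t i;

    for (i = 0; i < n; i++)
        total += nh[i].weight;
    assert(n > 0 && total > 0);

    /* Map the running counter onto the weight range [0, total). */
    slot = (*counter)++ % total;
    for (i = 0; i < n; i++) {
        if (slot < nh[i].weight)
            return i;
        slot -= nh[i].weight;
    }
    return n - 1; /* not reached when total > 0 */
}
```

With a counter used by a single route, weights 1 and 3 give a 1:3 split
over every four packets. When two routes share the counter and their
packets strictly alternate, each route with two equal-weight next hops
keeps landing on the same next hop in this sketch, which is the "not
true round-robin" case described above.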
We (Brocade) did some work on this, but it was never finished enough to
submit upstream. The ideal is to allow configuring the choice of
algorithm per route.
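
As a rough illustration of what a per-route algorithm choice could look
like, the userspace sketch below selects a weighted next hop from either
an L3 hash (addresses only) or an L4 hash (addresses, ports, protocol).
Every name here (enum mp_algo, struct flow_key, struct nexthop, mix32(),
flow_hash(), select_nexthop()) is hypothetical and is not taken from the
kernel or from the Brocade work; the mixer is just a generic 32-bit
integer finalizer picked for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-route algorithm selector; not a kernel attribute. */
enum mp_algo { MP_ALGO_L3, MP_ALGO_L4 };

/* Hypothetical flow description for illustration. */
struct flow_key {
    uint32_t saddr, daddr;  /* IPv4 source and destination address */
    uint16_t sport, dport;  /* L4 ports; ignored by MP_ALGO_L3 */
    uint8_t  proto;         /* IP protocol; ignored by MP_ALGO_L3 */
};

struct nexthop {
    int id;
    unsigned int weight;
};

/* Generic 32-bit integer mixer, used only to spread input bits. */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

static uint32_t flow_hash(const struct flow_key *k, enum mp_algo algo)
{
    uint32_t h = mix32(k->saddr ^ mix32(k->daddr));

    if (algo == MP_ALGO_L4) {
        h = mix32(h ^ (((uint32_t)k->sport << 16) | k->dport));
        h = mix32(h ^ k->proto);
    }
    return h;
}

/* Map the flow hash onto the weight range and return the next-hop index. */
static size_t select_nexthop(const struct nexthop *nh, size_t n,
                             const struct flow_key *k, enum mp_algo algo)
{
    unsigned int total = 0;
    unsigned int slot;
    size_t i;

    for (i = 0; i < n; i++)
        total += nh[i].weight;
    assert(n > 0 && total > 0);

    slot = flow_hash(k, algo) % total;
    for (i = 0; i < n; i++) {
        if (slot < nh[i].weight)
            return i;
        slot -= nh[i].weight;
    }
    return n - 1; /* not reached when total > 0 */
}
```

Because the L3 variant never looks at the ports, all packets between the
same pair of addresses map to the same next hop in this sketch, including
IPv4 fragments that carry no L4 header.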