Message-ID: <3c965294-fe7d-3893-e9d9-3354ff508731@gmail.com>
Date: Thu, 6 Aug 2020 10:45:52 -0600
From: David Ahern <dsahern@...il.com>
To: Ido Schimmel <idosch@...sch.org>
Cc: Yi Yang (杨燚)-云服务集团
<yangyi01@...pur.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"nikolay@...ulusnetworks.com" <nikolay@...ulusnetworks.com>
Subject: Re: 答复: [PATCH] can current ECMP implementation support consistent hashing for next hop?
On 8/2/20 8:49 AM, Ido Schimmel wrote:
> On Thu, Jun 11, 2020 at 10:36:59PM -0600, David Ahern wrote:
>> On 6/11/20 6:32 PM, Yi Yang (杨燚)-云服务集团 wrote:
>>> David, thank you so much for confirming it can't. I did read your Cumulus document before; resilient hashing is fine for next hop removal, but it still has the same issue when a new next hop is added. I know most of the kernel code in Cumulus Linux has been upstreamed, so I'm wondering why you didn't push resilient hashing to the upstream kernel.
>>>
>>> I think consistent hashing is a must-have for a commercial load balancing solution, otherwise it is basically nonsense. Does Cumulus Linux have a consistent hashing solution?
>>>
>>> Is "- replacing nexthop entries as LB's come and go" ithe stuff https://docs.cumulusnetworks.com/cumulus-linux/Layer-3/Equal-Cost-Multipath-Load-Sharing-Hardware-ECMP/#resilient-hashing is showing? It can't ensure the flow is distributed to the right backend server if a new next hop is added.
>>
>> I do not believe it is a problem to be solved in the kernel.
>>
>> If you follow the *intent* of the Cumulus document: what is the maximum
>> number of load balancers you expect to have? 16? 32? 64? Define an ECMP
>> route with that number of nexthops and fill in the weighting that meets
>> your needs. When an LB is added or removed, you decide what the new set
>> of paths is that maintains N-total paths with the distribution that
>> meets your needs.
>
> I recently started looking into consistent hashing and I wonder if it
> can be done with the new nexthop API while keeping all the logic in user
> space (e.g., FRR).
>
> The only extension that might be required from the kernel is a new
> nexthop attribute that indicates when a nexthop was last used.
The only potential problem that comes to mind is that a nexthop can be
used by multiple prefixes.
But I'm not sure I follow what a last-used indicator gives you for
maintaining flows as a group is updated.
> User space can then use it to understand which nexthops to replace when
> a new nexthop is added and when to perform the replacement. In case the
> nexthops are offloaded, it is possible for the driver to periodically
> update the nexthop code about their activity.
>
> Below is a script that demonstrates the concept with the example in the
> Cumulus documentation. I chose to replace the individual nexthops
> instead of creating new ones and then replacing the group.
That is one of the features ... a group points to individual nexthops
and those can be atomically updated without affecting the group.
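For example, a rough sketch of that mechanic (placeholder IDs, addresses
and device; this is not the script referenced above):

# Individual nexthops and a group that points to them.
ip nexthop add id 1 via 192.0.2.11 dev eth0
ip nexthop add id 2 via 192.0.2.12 dev eth0
ip nexthop add id 100 group 1/2
ip route add 198.51.100.0/24 nhid 100

# Swap the LB behind nexthop 2. The group (id 100) and the route are
# untouched, so traffic hashed to member 1 keeps its path.
ip nexthop replace id 2 via 192.0.2.13 dev eth0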
>
> It is obviously possible to create larger groups to reduce the impact on
> existing flows when a new nexthop is added.
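To make the larger-group point concrete, a rough sketch with placeholder
IDs, prefixes and addresses: eight members over two LBs, so bringing up a
third LB only repoints the members you choose.

# Eight group members spread over two LBs (four each).
for i in 1 2 3 4; do ip nexthop add id $i via 192.0.2.11 dev eth0; done
for i in 5 6 7 8; do ip nexthop add id $i via 192.0.2.12 dev eth0; done
ip nexthop add id 200 group 1/2/3/4/5/6/7/8
ip route add 203.0.113.0/24 nhid 200

# Add a third LB by repointing two of the eight members; flows hashed
# to the other six members stay where they are.
ip nexthop replace id 4 via 192.0.2.13 dev eth0
ip nexthop replace id 8 via 192.0.2.13 dev eth0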
>
> WDYT?
This is in line with my earlier responses, and your script shows an
example of how to manage it. Combine it with the active-backup patch set
and you handle device events too (avoiding disruption to the size of the
group on device events).