[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2e50079c-5c17-39d3-ebf5-b7ed3c829151@cumulusnetworks.com>
Date: Fri, 17 Nov 2017 10:01:39 +0200
From: Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
To: Willy Tarreau <w@....eu>,
Stephen Hemminger <stephen@...workplumber.org>
Cc: Vincent Bernat <bernat@...fy.cx>, Andrew Lunn <andrew@...n.ch>,
Sarah Newman <srn@...mr.com>, netdev@...r.kernel.org,
roopa <roopa@...ulusnetworks.com>
Subject: Re: [PATCH] net: bridge: add max_fdb_count
On 17/11/17 08:14, Nikolay Aleksandrov wrote:
> On 17/11/17 07:26, Willy Tarreau wrote:
>> Hi Stephen,
>>
>> On Thu, Nov 16, 2017 at 04:27:18PM -0800, Stephen Hemminger wrote:
>>> On Thu, 16 Nov 2017 21:21:55 +0100
>>> Vincent Bernat <bernat@...fy.cx> wrote:
>>>
>>>> ? 16 novembre 2017 20:23 +0100, Andrew Lunn <andrew@...n.ch> :
>>>>
>>>>> struct net_bridge_fdb_entry is 40 bytes.
>>>>>
>>>>> My WiFi access point which is also a 5 port bridge, currently has 97MB
>>>>> free RAM. That is space for about 2.5M FDB entries. So even Roopa's
>>>>> 128K is not really a problem, in terms of memory.
>>>>
>>>> I am also interested in Sarah's patch because we can now have bridge
>>>> with many ports through VXLAN. The FDB can be replicated to an external
>>>> daemon with BGP and the cost of each additional MAC address is therefore
>>>> higher than just a few bytes. It seems simpler to implement a limiting
>>>> policy early (at the port or bridge level).
>>>>
>>>> Also, this is a pretty standard limit to have for a bridge (switchport
>>>> port-security maximum on Cisco, set interface X mac-limit on
>>>> Juniper). And it's not something easy to do with ebtables.
>>>
>>> I want an optional limit per port, it makes a lot of sense.
>>> If for no other reason that huge hash tables are a performance problems.
>>
>> Except its not a limit in that it doesn't prevent new traffic from going
>> in, it only prevents new MACs from being learned, so suddenly you start
>> flooding all ports with new traffic once the limit is reached, which is
>> not trivial to detect nor diagnose.
>>
>>> There is a bigger question about which fdb to evict but just dropping the
>>> new one seems to be easiest and as good as any other solution.
>>
>> Usually it's better to apply LRU or random here in my opinion, as the
>> new entry is much more likely to be needed than older ones by definition.
>> In terms of CPU usage it may even be better to kill an entire series in
>> the hash table (eg: all nodes in the same table entry for example), as
>> the operation can be almost as cheap and result in not being needed for
>> a while again.
>>
>> Willy
>>
>
> Hi,
> I have been thinking about this and how to try and keep everyone happy
> while maintaining performance, so how about this:
>
> * Add a per-port fdb LRU list which is used only when there are >= 2000
> global fdb entries _or_ a per-port limit is set. If the list is in use,
> update the fdb list position once per second. I think these properties will
> allow us to scale and have a better LRU locking granularity (per-port), and in
> smaller setups (not needing LRU) the cost will be a single test in fast path.
>
It seems I spoke too soon, I just tried a quick and dirty version of this and the
per-port LRU becomes a nightmare with the completely lockless fdb dst port changes.
So I think going back to the original proposition with new fdb dropping is best
for now, otherwise we need to add some synchronization on fdb port move. In general
the per-port locks would be tricky to handle with moving fdbs, so even if we add
a per-port LRU list we'll need some global lock to handle moves and updates in a
simple manner, and I'd like to avoid adding a new bridge lock.
> * Use the above LRU list for per-port limit evictions
> > * More importantly use the LRU list for fdb expire, removing the need to walk
> over all fdbs every time we expire entries. This would be of great help for
> larger setups with many fdbs (it will kick in on >= 2000 fdb entries).
>
Will have to drop the above point for now, even though I was mostly interested
in that.
> * (optional) Make the global LRU kick in limit an option, people might want to
> minimize traffic blocking due to expire process.
>
> Any comments and suggestions are welcome. When we agree on the details I'll do
> the RFC patches and run some tests before submitting. Defaults will be kept as
> they are now. I've chosen the 2000 limit arbitrarily and am happy to change it
> if people have something else in mind. This should play nice with the resizeable
> hashtable change.
>
> Thanks,
> Nik
>
>
>
>
>
Powered by blists - more mailing lists