netdev - Re: [PATCH] net: bridge: add max_fdb

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <9c4d1ae6-97db-26a4-383f-87183cf04428@cumulusnetworks.com>
Date:   Fri, 17 Nov 2017 08:14:10 +0200
From:   Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
To:     Willy Tarreau <w@....eu>,
        Stephen Hemminger <stephen@...workplumber.org>
Cc:     Vincent Bernat <bernat@...fy.cx>, Andrew Lunn <andrew@...n.ch>,
        Sarah Newman <srn@...mr.com>, netdev@...r.kernel.org,
        roopa <roopa@...ulusnetworks.com>
Subject: Re: [PATCH] net: bridge: add max_fdb_count

On 17/11/17 07:26, Willy Tarreau wrote:
> Hi Stephen,
> 
> On Thu, Nov 16, 2017 at 04:27:18PM -0800, Stephen Hemminger wrote:
>> On Thu, 16 Nov 2017 21:21:55 +0100
>> Vincent Bernat <bernat@...fy.cx> wrote:
>>
>>>  ? 16 novembre 2017 20:23 +0100, Andrew Lunn <andrew@...n.ch> :
>>>
>>>> struct net_bridge_fdb_entry is 40 bytes.
>>>>
>>>> My WiFi access point which is also a 5 port bridge, currently has 97MB
>>>> free RAM. That is space for about 2.5M FDB entries. So even Roopa's
>>>> 128K is not really a problem, in terms of memory.  
>>>
>>> I am also interested in Sarah's patch because we can now have bridge
>>> with many ports through VXLAN. The FDB can be replicated to an external
>>> daemon with BGP and the cost of each additional MAC address is therefore
>>> higher than just a few bytes. It seems simpler to implement a limiting
>>> policy early (at the port or bridge level).
>>>
>>> Also, this is a pretty standard limit to have for a bridge (switchport
>>> port-security maximum on Cisco, set interface X mac-limit on
>>> Juniper). And it's not something easy to do with ebtables.
>>
>> I want an optional limit per port, it makes a lot of sense.
>> If for no other reason that huge hash tables are a performance problems.
> 
> Except its not a limit in that it doesn't prevent new traffic from going
> in, it only prevents new MACs from being learned, so suddenly you start
> flooding all ports with new traffic once the limit is reached, which is
> not trivial to detect nor diagnose.
> 
>> There is a bigger question about which fdb to evict but just dropping the
>> new one seems to be easiest and as good as any other solution.
> 
> Usually it's better to apply LRU or random here in my opinion, as the
> new entry is much more likely to be needed than older ones by definition.
> In terms of CPU usage it may even be better to kill an entire series in
> the hash table (eg: all nodes in the same table entry for example), as
> the operation can be almost as cheap and result in not being needed for
> a while again.
> 
> Willy
> 

Hi,
I have been thinking about this and how to try and keep everyone happy
while maintaining performance, so how about this:

 * Add a per-port fdb LRU list which is used only when there are >= 2000
   global fdb entries _or_ a per-port limit is set. If the list is in use,
   update the fdb list position once per second. I think these properties will
   allow us to scale and have a better LRU locking granularity (per-port), and in
   smaller setups (not needing LRU) the cost will be a single test in fast path.

 * Use the above LRU list for per-port limit evictions

 * More importantly use the LRU list for fdb expire, removing the need to walk
   over all fdbs every time we expire entries. This would be of great help for
   larger setups with many fdbs (it will kick in on >= 2000 fdb entries).

 * (optional) Make the global LRU kick in limit an option, people might want to
   minimize traffic blocking due to expire process.

Any comments and suggestions are welcome. When we agree on the details I'll do
the RFC patches and run some tests before submitting. Defaults will be kept as
they are now. I've chosen the 2000 limit arbitrarily and am happy to change it
if people have something else in mind. This should play nice with the resizeable
hashtable change.

Thanks,
 Nik