Message-ID: <3d08c77f-8d71-e302-d3f7-24acc6df9414@prgmr.com>
Date:   Thu, 16 Nov 2017 10:23:06 -0800
From:   Sarah Newman <srn@...mr.com>
To:     Willy Tarreau <w@....eu>
Cc:     Nikolay Aleksandrov <nikolay@...ulusnetworks.com>,
        netdev@...r.kernel.org, roopa <roopa@...ulusnetworks.com>
Subject: Re: [PATCH] net: bridge: add max_fdb_count

On 11/16/2017 01:58 AM, Willy Tarreau wrote:
> Hi Sarah,
> 
> On Thu, Nov 16, 2017 at 01:20:18AM -0800, Sarah Newman wrote:
>> I note that anyone who would run up against a too-low limit on the maximum
>> number of fdb entries would also be savvy enough to fix it in a matter of
>> minutes.
> 
> I disagree on this point. There's a huge difference between experiencing
> sudden breakage under normal conditions due to arbitrary limits being set
> and being down because of an attack. While the latter is not desirable,
> it's much more easily accepted and most often requires operations anyway.
> The former is never an option.

Yes, being down during an attack is expected, assuming you know you are
being attacked.

Linux bridges can also be used in small embedded devices. With no limit,
the likely result of such a device being attacked is that it gets thrown
away for being unreliable.

> 
> And I continue to think that the default behaviour once the limit is reached
> must not be to prevent new entries from being learned but to purge older
> ones. At least it preserves normal operations.

I'm not disagreeing.
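
To make the purge-oldest behaviour concrete, here is a minimal user-space
sketch in Python. It is not the bridge code or the patch; the class name
BoundedFdb and its methods are hypothetical, and "oldest" is assumed to mean
least recently learned or refreshed.

```python
from collections import OrderedDict


class BoundedFdb:
    """Toy forwarding table (hypothetical, not kernel code).

    When the table is full, learning a new MAC evicts the entry that was
    learned or refreshed longest ago instead of refusing the new one.
    """

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self.entries = OrderedDict()  # MAC -> port, oldest entry first

    def learn(self, mac, port):
        if mac in self.entries:
            # Known address: refresh its age by moving it to the end.
            self.entries.move_to_end(mac)
        elif len(self.entries) >= self.max_entries:
            # Table full: purge the oldest entry to make room.
            self.entries.popitem(last=False)
        self.entries[mac] = port

    def lookup(self, mac):
        return self.entries.get(mac)
```

With a limit of 2, learning a third address drops whichever of the first two
was seen least recently, so hosts that keep sending traffic stay learned.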

I spent maybe a couple of hours on this patch and was hoping someone else would
find more time to spend on the problem.

> 
> But given the high CPU impact you reported for a very low load, definitely
> something needs to be done
It's nice to think so.

> 
>> They could also default the limit to U32_MAX in their particular
>> distribution if it was a configuration option.
> 
> Well, I'd say that we don't have a default limit on the socket number either
> and that it happens to be the expected behaviour. It's almost impossible to
> find a suitable limit for everyone. People dealing with regular loads never
> read docs and get caught. People working in hostile environments are always
> more careful and will ensure that their limits are properly set.

Neighbor tables for ipv4/ipv6 seem more comparable. gc_thresh3 is 1024 and
typically needs to be adjusted higher for Linux routers.

As you say, there is no default limit that suits everyone. So the question is
really who is burdened with changing the default.

There is a lot of talk of not breaking existing users. The current
implementation is demonstrably vulnerable, and since the problem is likely to
be silent, there's no good way to know how often it has actually occurred. I
note that the tool to trigger it is readily available and that this is a
well-known type of attack.

But I understand if there have been too few known attacks to justify
changing the current default. It could be left to user space to set a new
default if it becomes a demonstrable problem.

If user space has to change at all you could again argue for a pure user-space
solution, but I'm not sure if a pure user-space solution would always have a
chance to fix the problem before the system was brought down.

> 
>> At the moment there is not even a single log message if the problem doesn't
>> result in memory exhaustion.
> 
> This probably needs to be addressed as well
Maybe what's needed is two thresholds, one for warning and one for enforcement.
The warning limit would need to be low enough that the warning had a good
chance of being logged before the system was under too much load to log
anything. The enforcement limit could be left inactive by default until it is
shown that it needs to be otherwise.
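
A rough sketch of the two-threshold idea, in user-space Python rather than
kernel code. The names FdbLimits, warn_at and enforce_at are made up for
illustration, and the list standing in for the kernel log is an assumption.

```python
class FdbLimits:
    """Hypothetical counter with a warning limit and an optional hard limit."""

    def __init__(self, warn_at, enforce_at=None):
        self.warn_at = warn_at
        self.enforce_at = enforce_at  # None means enforcement is inactive
        self.count = 0
        self.warned = False
        self.log = []  # stand-in for a kernel log message

    def try_add(self):
        """Account for one new entry; return False if it is refused."""
        if self.enforce_at is not None and self.count >= self.enforce_at:
            return False
        self.count += 1
        if self.count >= self.warn_at and not self.warned:
            # Warn once, early, while the system can still log.
            self.warned = True
            self.log.append(
                f"fdb entries reached warning threshold ({self.warn_at})"
            )
        return True
```

With enforce_at left as None, nothing is ever refused and the only visible
change is a single log line once warn_at is reached.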

--Sarah
