Date:   Wed, 25 Jul 2018 14:17:21 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     David Ahern <dsahern@...il.com>
Cc:     Cong Wang <xiyou.wangcong@...il.com>,
        David Miller <davem@...emloft.net>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        nikita.leshchenko@...cle.com,
        Roopa Prabhu <roopa@...ulusnetworks.com>,
        Stephen Hemminger <stephen@...workplumber.org>,
        Ido Schimmel <idosch@...lanox.com>,
        Jiri Pirko <jiri@...lanox.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Alexander Aring <alex.aring@...il.com>,
        linux-wpan@...r.kernel.org,
        NetFilter <netfilter-devel@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC/RFT net-next 00/17] net: Convert neighbor tables to per-namespace

David Ahern <dsahern@...il.com> writes:

> On 7/25/18 11:38 AM, Eric W. Biederman wrote:
>> 
>> Absolutely NOT.  Global thresholds are exactly correct given the fact
>> you are running on a single kernel.
>> 
>> Memory is not free (even though we are swimming in enough of it that
>> memory rarely matters).  One of the few remaining challenges for
>> containers is finding ways to limit resources in such a way that one
>> application does not mess things up for another container during
>> ordinary usage.
>> 
>> It looks like the neighbour tables absolutely are that kind of problem,
>> because the artificial limits are too strict.   Completely giving up on
>> limits does not seem the right approach either.  We need to fix the limits
>> we have (perhaps making them go away entirely), not just apply a
>> band-aid.  Let's get to the bottom of this and make the system better.
>
> Eric: yes, they all share the global resource of memory and there should
> be limits on how many entries a remote entity can create.
>
> Network namespaces can provide a separation such that one namespace does
> not disrupt networking in another. It is absolutely appropriate to do
> so. Your rigid stance is inconsistent given the basic meaning of a
> network namespace and the parallels to this same problem -- bridges,
> vxlans, and ip fragments. Only neighbor tables are not per-device or per
> namespace; your insistence on global limits is missing the mark and wrong.

That is not what I said.  Let me rephrase and see if you understand.

The problem appears to be one of lots of devices.  Fundamentally, if you
use lots of network devices today, you will run out of neighbour table
entries unless you adjust gc_thresh3.
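
For anyone who wants to see how close a box is to that limit, here is a
rough sketch, not a definitive tool.  It assumes the IPv4 neighbour
sysctls live under /proc/sys/net/ipv4/neigh/default and that counting
lines in /proc/net/arp is a good enough estimate.  That file only shows
the current namespace and skips some entry states, so treat the count as
a lower bound.  The helper names are made up for illustration.

```python
from pathlib import Path

# Assumed location of the IPv4 neighbour sysctls (IPv6 has its own directory).
NEIGH_DEFAULT = Path("/proc/sys/net/ipv4/neigh/default")


def read_thresholds(base=NEIGH_DEFAULT):
    """Return (gc_thresh1, gc_thresh2, gc_thresh3) as integers."""
    return tuple(
        int((Path(base) / f"gc_thresh{i}").read_text().split()[0])
        for i in (1, 2, 3)
    )


def count_arp_entries(arp_file="/proc/net/arp"):
    """Count entries listed in an arp-style file, skipping the header line."""
    lines = Path(arp_file).read_text().splitlines()
    return max(len(lines) - 1, 0)


def headroom(entries, gc_thresh3):
    """Entries left before the hard limit; zero or less means the table is full."""
    return gc_thresh3 - entries


if __name__ == "__main__":
    t1, t2, t3 = read_thresholds()
    used = count_arp_entries()
    print(f"gc_thresh1={t1} gc_thresh2={t2} gc_thresh3={t3}")
    print(f"arp entries (this namespace)={used} headroom={headroom(used, t3)}")
```

If the headroom is small on a machine that is just doing its normal
job, that is the situation described above.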

The problem has a bigger scope than what you are looking at.

If you fix the core problem you won't see the problem in the context
of network namespaces either.

Default limits should be something that will never be hit unless
something goes crazy.  We are hitting them.  Therefore by definition
there is a bug in these limits.


And yes, there is absolutely a place for global limits on things like
inodes, file descriptors, etc., that do not care about which part of the
kernel you are in.  However, hitting those limits in normal operation is
a bug.

We have ourselves a bug.

Eric

p.s. I wrote the definition of network namespaces and it absolutely does
have room for global limits.   One of the things Linus has periodically
yelled at me about is that there are not enough of them.

