lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 18 Aug 2020 17:02:32 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Uladzislau Rezki <urezki@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
        linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Matthew Wilcox <willy@...radead.org>,
        "Theodore Y . Ts'o" <tytso@....edu>,
        Joel Fernandes <joel@...lfernandes.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...ymobile.com>
Subject: Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag

On Tue 18-08-20 06:53:27, Paul E. McKenney wrote:
> On Tue, Aug 18, 2020 at 09:43:44AM +0200, Michal Hocko wrote:
> > On Mon 17-08-20 15:28:03, Paul E. McKenney wrote:
> > > On Mon, Aug 17, 2020 at 10:28:49AM +0200, Michal Hocko wrote:
> > > > On Mon 17-08-20 00:56:55, Uladzislau Rezki wrote:
> > > 
> > > [ . . . ]
> > > 
> > > > > wget ftp://vps418301.ovh.net/incoming/1000000_kmalloc_kfree_rcu_proc_percpu_pagelist_fractio_is_8.png
> > > > 
> > > > 1/8 of the memory in pcp lists is quite large and likely not something
> > > > used very often.
> > > > 
> > > > Both these numbers just make me think that a dedicated pool of page
> > > > pre-allocated for RCU specifically might be a better solution. I still
> > > > haven't read through that branch of the email thread though so there
> > > > might be some pretty convincing argments to not do that.
> > > 
> > > To avoid the problematic corner cases, we would need way more dedicated
> > > memory than is reasonable, as in well over one hundred pages per CPU.
> > > Sure, we could choose a smaller number, but then we are failing to defend
> > > against flooding, even on systems that have more than enough free memory
> > > to be able to do so.  It would be better to live within what is available,
> > > taking the performance/robustness hit only if there isn't enough.
> > 
> > Thomas had a good point that it doesn't really make much sense to
> > optimize for flooders because that just makes them more effective.
> 
> The point is not to make the flooders go faster, but rather for the
> system to be robust in the face of flooders.  Robust as in harder for
> a flooder to OOM the system.

Do we see this to be a practical problem? I am really confused because
the initial argument was revolving around an optimization now you are
suggesting that this is actually system stability measure. And I fail to
see how allowing an easy way to deplete pcp caches completely solves
any of that. Please do realize that if allow that then every user who
relies on pcp caches will have to take a slow(er) path and that will
have performance consequences. The pool is a global and a scarce
resource. That's why I've suggested a dedicated preallocated pool and
use it instead of draining global pcp caches.

> And reducing the number of post-grace-period cache misses makes it
> easier for the callback-invocation-time memory freeing to keep up with
> the flooder, thus avoiding (or at least delaying) the OOM.
> 
> > > My current belief is that we need a combination of (1) either the
> > > GFP_NOLOCK flag or Peter Zijlstra's patch and
> > 
> > I must have missed the patch?
> 
> If I am keeping track, this one:
> 
> https://lore.kernel.org/lkml/20200814215206.GL3982@worktop.programming.kicks-ass.net/

OK, I have certainly noticed that one but didn't react but my response
would be similar to the dedicated gfp flag. This is less of a hack than
__GFP_NOLOCK  but it still exposes very internal parts of the allocator
and I find that a quite problematic from the future maintenance of the
allocator. The risk of an easy depletion of the pcp pool is also there
of course.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ