linux-kernel - Re: [RFC-PATCH 1/2] mm: Add __GFP_NO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200817084719.GB28270@dhcp22.suse.cz>
Date:   Mon, 17 Aug 2020 10:47:19 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     paulmck@...nel.org, Uladzislau Rezki <urezki@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
        linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Matthew Wilcox <willy@...radead.org>,
        "Theodore Y . Ts'o" <tytso@....edu>,
        Joel Fernandes <joel@...lfernandes.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...ymobile.com>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag

On Sat 15-08-20 01:14:53, Thomas Gleixner wrote:
[...]
> For normal operations a couple of pages which can be preallocated are
> enough. What you are concerned of is the case where you run out of
> pointer storage space.
> 
> There are two reasons why that can happen:
> 
>       1) RCU call flooding
>       2) RCU not being able to run and mop up the backlog
> 
> #1 is observable by looking at the remaining storage space and the RCU
>    call frequency
> 
> #2 is uninteresting because it's caused by RCU being stalled / delayed
>    e.g. by a runaway of some sorts or a plain RCU usage bug.
>    
>    Allocating more memory in that case does not solve or improve anything.
> 
> So the interesting case is #1. Which means we need to look at the
> potential sources of the flooding:
> 
>     1) User space via syscalls, e.g. open/close
>     2) Kernel thread
>     3) Softirq
>     4) Device interrupt
>     5) System interrupts, deep atomic context, NMI ...
> 
> #1 trivial fix is to force switching to an high prio thread or a soft
>    interrupt which does the allocation
> 
> #2 Similar to #1 unless that thread loops with interrupts, softirqs or
>    preemption disabled. If that's the case then running out of RCU
>    storage space is the least of your worries.
> 
> #3 Similar to #2. The obvious candidates (e.g. NET) for monopolizing a
>    CPU have loop limits in place already. If there is a bug which fails
>    to care about the limit, why would RCU care and allocate more memory?
> 
> #4 Similar to #3. If the interrupt handler loops forever or if the
>    interrupt is a runaway which prevents task/softirq processing then
>    RCU free performance is the least of your worries.
> 
> #5 Clearly a bug and making RCU accomodate for that is beyond silly.
> 
> So if call_rcu() detects that the remaining storage space for pointers
> goes below the critical point or if it observes high frequency calls
> then it simply should force a soft interrupt which does the allocation.
>
> Allocating from softirq context obviously without holding the raw lock
> which is used inside call_rcu() is safe on all configurations.
> 
> If call_rcu() is forced to use the fallback for a few calls until this
> happens then that's not the end of the world. It is not going to be a
> problem ever for the most obvious issue #1, user space madness, because
> that case cannot delay the softirq processing unless there is a kernel
> bug which makes again RCU free performance irrelevant.

Yes, this makes perfect sense to me! I really do not think we want to
optimize for a userspace abuse to allow complete pcp allocator memory
depletion (or a control in a worse case).

Thanks!
-- 
Michal Hocko
SUSE Labs