linux-kernel - Re: [PATCH RFC] rcu/tree: Use GFP_MEMALLOC for alloc memory to free memory pattern

Message-ID: <20200401122550.GA32593@pc636>
Date:   Wed, 1 Apr 2020 14:25:50 +0200
From:   Uladzislau Rezki <urezki@...il.com>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Uladzislau Rezki <urezki@...il.com>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, rcu@...r.kernel.org, willy@...radead.org,
        peterz@...radead.org, neilb@...e.com, vbabka@...e.cz,
        mgorman@...e.de, Andrew Morton <akpm@...ux-foundation.org>,
        Josh Triplett <josh@...htriplett.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH RFC] rcu/tree: Use GFP_MEMALLOC for alloc memory to free
 memory pattern

> > I think GFP_ATOMIC should be used, because it has more chance to
> > return memory than GFP_NOWAIT. I see that Michal has the same view on it.
> 
> I don't think so because GFP_ATOMIC implies GFP_NOWAIT. I am Ok with keeping
> the GFP_ATOMIC as it is btw. Paul mentioned he prefers this. I agree with
> that as well.
> 
GFP_ATOMIC can access reserved memory, whereas GFP_NOWAIT is not
eligible to do so. So there is a difference between them :)
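
To illustrate the difference, here is a minimal userspace sketch, not
kernel code. The MOCK_* names, the bit values and the "halve the
watermark" rule are assumptions chosen for illustration; the real flag
definitions live in include/linux/gfp.h and the real watermark logic is
more involved and varies between kernel versions.

```c
#include <stdbool.h>

/* Illustrative bit values only, not the kernel's. */
#define MOCK_GFP_HIGH            0x01u  /* models __GFP_HIGH: may dip into reserves */
#define MOCK_GFP_KSWAPD_RECLAIM  0x02u  /* models __GFP_KSWAPD_RECLAIM: wake kswapd */

/* Both masks avoid sleeping; only the atomic one carries the "high" bit. */
#define MOCK_GFP_NOWAIT  (MOCK_GFP_KSWAPD_RECLAIM)
#define MOCK_GFP_ATOMIC  (MOCK_GFP_HIGH | MOCK_GFP_KSWAPD_RECLAIM)

/*
 * Simplified watermark check for a single-page request: the request
 * succeeds if the free pages left afterwards stay at or above the
 * (possibly lowered) minimum watermark.
 */
static bool mock_can_alloc(unsigned int gfp, long free_pages, long min_wmark)
{
	long min = min_wmark;

	if (gfp & MOCK_GFP_HIGH)
		min -= min / 2;	/* assumed rule: high-priority callers see a lower bar */

	return free_pages - 1 >= min;
}
```

With 80 free pages and a watermark of 100, the model lets the
MOCK_GFP_ATOMIC request through and rejects the MOCK_GFP_NOWAIT one.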

> > > 
> > > Yes, the benefit of the trace/warning is that the user can switch to a
> > > non-headless API and avoid the synchronize_rcu(), that would help them get
> > > faster kfree_rcu() performance instead of having silent slowdowns.
> > > 
> > Agree. What about just adding WARN_ON_ONCE()? I am just thinking if it
> > could be harmful or not.
> 
> You mean WARN_ON_ONCE() before the synchronize_rcu() right? We could do that.
> Paul mentioned to me he prefers if this new warning can be turned off with a
> boot parameter since some future user may prefer no warning. I also agree.
> 
Yes, we can add it before doing synchronize_rcu(). WARN_ON_ONCE() will
emit the warning only once. I think that would be enough to draw
attention.
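
A rough userspace model of that slow path, only to show the intended
ordering. Everything prefixed with mock_ is hypothetical, and
warn_on_fallback stands in for the boot parameter Paul asked for; none
of this is the actual patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

static bool warn_on_fallback = true;	/* hypothetical boot-parameter stand-in */
static int warnings_emitted;		/* counters exist only for the model */
static int sync_calls;

/* Stand-in for synchronize_rcu(): just records that we blocked. */
static void mock_synchronize_rcu(void)
{
	sync_calls++;
}

/* Stand-in for WARN_ON_ONCE(): fires at most once, and only if enabled. */
static void mock_warn_once(void)
{
	static bool warned;

	if (warn_on_fallback && !warned) {
		warned = true;
		warnings_emitted++;
		fprintf(stderr, "headless kvfree_rcu(): falling back to synchronize_rcu()\n");
	}
}

/*
 * Headless free: if the pointer-array block could be obtained, take the
 * fast path; otherwise warn once, wait for a grace period, then free.
 * Returns true for the fast path, false for the fallback. The fast path
 * frees immediately here because the model has no deferred batching.
 */
static bool mock_kvfree_rcu_headless(void *ptr, bool block_alloc_ok)
{
	if (block_alloc_ok) {
		free(ptr);
		return true;
	}

	mock_warn_once();
	mock_synchronize_rcu();
	free(ptr);
	return false;
}
```

Hitting the fallback twice waits twice but warns only once, which is
the behaviour I would expect from WARN_ON_ONCE().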

>
> If we add this then we can keep your __GFP_NOWARN flag with no additional GFP
> flag changes.
>
We can also add __GFP_RETRY_MAYFAIL to GFP_ATOMIC to make it tighter.
Basically, your patch can be modified by just adding that.

> > > It also tells us whether the headless API is worth it in the long run, I
> > > think it is worth it because we will likely never hit the synchronize_rcu()
> > > failsafe. But if we hit it a lot, at least it won't happen silently.
> > > 
> > Agree.
> > 
> > > Paul was concerned about following scenario with hitting synchronize_rcu():
> > > 1. Consider a system under memory pressure.
> > > 2. Consider some other subsystem X depending on another system Y which uses
> > >    kfree_rcu(). If Y doesn't complete the operation in time, X accumulates
> > >    more memory.
> > > 3. Since kfree_rcu() on Y hits synchronize_rcu() a lot, it slows it down.
> > >    This causes X to further allocate memory, further causing a chain
> > >    reaction.
> > > Paul, please correct me if I'm wrong.
> > > 
> > I see your point and agree that in theory it can happen. So, we should
> > make it more tight when it comes to rcu_head attachment logic.
> 
> Right. Per discussion with Paul, we discussed that it is better if we
> pre-allocate N number of array blocks per-CPU and use it for the cache.
> Default for N being 1 and tunable with a boot parameter. I agree with this.
> 
As discussed before, we can use the memory pool API for this
purpose. But I am not sure whether it should be one pool per CPU or
one pool for all NR_CPUS, which would contain NR_CPUS * N pre-allocated
blocks.
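
To make the two layouts concrete, a small userspace sketch follows. It
does not use the kernel mempool API; struct mock_pool, the MOCK_*
constants and the helper names are all made up for illustration, and
the block size is shrunk so the model stays tiny.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NR_CPUS	4	/* assumed CPU count for the model */
#define MOCK_N		1	/* pre-allocated blocks per CPU */
#define MOCK_MAX_BLOCKS	(MOCK_NR_CPUS * MOCK_N)
#define MOCK_BLOCK_SIZE	64	/* a real block would be page sized */

struct mock_pool {
	unsigned char storage[MOCK_MAX_BLOCKS][MOCK_BLOCK_SIZE];
	void *free_blocks[MOCK_MAX_BLOCKS];	/* stack of available blocks */
	int nr_free;
};

/* Pre-allocate 'capacity' blocks up front, so later gets never allocate. */
static void mock_pool_init(struct mock_pool *p, int capacity)
{
	assert(capacity >= 0 && capacity <= MOCK_MAX_BLOCKS);
	p->nr_free = 0;
	for (int i = 0; i < capacity; i++)
		p->free_blocks[p->nr_free++] = p->storage[i];
}

/* Take a block, or return NULL when the reserve is exhausted. */
static void *mock_pool_get(struct mock_pool *p)
{
	return p->nr_free > 0 ? p->free_blocks[--p->nr_free] : NULL;
}

/* Return a block to the reserve. */
static void mock_pool_put(struct mock_pool *p, void *block)
{
	assert(p->nr_free < MOCK_MAX_BLOCKS);
	p->free_blocks[p->nr_free++] = block;
}
```

With one pool per CPU, each is initialised with MOCK_N blocks and a busy
CPU can never take more than its own N. With a single shared pool
initialised with MOCK_NR_CPUS * MOCK_N blocks, one CPU can drain the
whole reserve, and the pool would also need locking, which the sketch
leaves out.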

> In current code, we have 1 cache page per CPU, but this is allocated only on
> the first kvfree_rcu() request. So we could change this behavior as well to
> make it pre-allocated.
> 
> Does this all sound good to you?
> 
I think that makes sense :)

--
Vlad Rezki