Message-ID: <20200422145752.GB362484@cmpxchg.org>
Date:   Wed, 22 Apr 2020 10:57:52 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Joel Fernandes <joel@...lfernandes.org>,
        Uladzislau Rezki <urezki@...il.com>,
        linux-kernel@...r.kernel.org,
        Josh Triplett <josh@...htriplett.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        rcu@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH RFC] rcu/tree: Refactor object allocation and try harder
 for array allocation

On Thu, Apr 16, 2020 at 11:01:00AM -0700, Paul E. McKenney wrote:
> On Thu, Apr 16, 2020 at 09:17:45AM -0400, Joel Fernandes wrote:
> > On Thu, Apr 16, 2020 at 12:30:07PM +0200, Uladzislau Rezki wrote:
> > > I have a question about dynamic attaching of the rcu_head. Do you think
> > > that we should drop it? We have it because it requires only
> > > 8 + sizeof(struct rcu_head) bytes and is used when we cannot allocate
> > > one page, which is much more than that, for the array. Therefore,
> > > dynamic attaching can succeed because it uses SLAB and requests much
> > > less memory than one page. That gives a higher chance of bypassing
> > > synchronize_rcu() and inlining freeing on a stack.
> > > 
> > > I agree that we should not use GFP_* flags; instead we could go with
> > > GFP_NOWAIT | __GFP_NOWARN for head attaching only, also dropping
> > > GFP_ATOMIC to keep the atomic reserves for others.
> 
> I must defer to people who understand the GFP flags better than I do.
> The suggestion of __GFP_RETRY_MAYFAIL for no memory pressure (or maybe
> when the CPU's reserve is not yet full) and __GFP_NORETRY otherwise came
> from one of these people.  ;-)

The exact flags we want here depend somewhat on the rate and size of
the kfree_rcu() bursts we can expect. We may want to start with one set
and instrument allocation success rates.

Memory tends to be fully consumed by the filesystem cache, so some
form of light reclaim is necessary for almost all allocations.

GFP_NOWAIT won't do any reclaim by itself, but it'll wake kswapd.
Kswapd maintains a small pool of free pages so that even allocations
that are allowed to enter reclaim usually don't have to. It would be
safe for RCU to dip into that.

However, there are some cons to using it:

- Depending on kfree_rcu() burst size, this pool could be exhausted
(it's usually about half a percent of memory, but is affected by
sysctls), and NOWAIT allocations would then fail until kswapd has
caught up.

- This pool is shared by all GFP_NOWAIT users, and many (most? all?)
of them cannot actually sleep. Often they would have to drop locks,
restart list iterations, or suffer some other form of deterioration to
work around failing allocations.
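
To put a rough number on the first point, here is a back-of-the-envelope
sketch in userspace C. It assumes the pool is exactly 0.5% of RAM (the
"about half a percent" figure above) and 4 KiB pages; both constants and
both function names are illustrative assumptions, not kernel code, and
real pool sizes depend on the watermark sysctls.

```c
#include <stdint.h>

/* Assumption: kswapd's free pool is ~0.5% of RAM (5 per mille). */
#define POOL_PER_MILLE 5u
/* Assumption: 4 KiB pages. */
#define PAGE_BYTES 4096u

/* Approximate number of pages in the free pool for mem_bytes of RAM. */
uint64_t pool_pages(uint64_t mem_bytes)
{
	return mem_bytes / 1000u * POOL_PER_MILLE / PAGE_BYTES;
}

/*
 * Nonzero if a burst of page-sized allocations would fit in the pool
 * without kswapd refilling it in the meantime.
 */
int burst_fits(uint64_t mem_bytes, uint64_t burst_pages)
{
	return burst_pages <= pool_pages(mem_bytes);
}
```

Under these assumptions a 16 GiB machine has a pool of roughly 21,000
pages, and that pool is shared with every other GFP_NOWAIT user.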

Since RCU wouldn't have anything better to do than sleep at this
juncture, it may as well join the reclaim effort.

Using __GFP_NORETRY or __GFP_RETRY_MAYFAIL would allow it to do that
without exerting too much pressure on the VM.
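
As a sketch of how that choice could be structured, the userspace C
model below picks a flag set from two inputs. Everything in it is
hypothetical: the MODEL_* bits are stand-ins and not the kernel's real
GFP values, model_pick_flags() is an invented name, and the
pressure-based split simply mirrors the suggestion quoted earlier in
the thread (__GFP_RETRY_MAYFAIL without memory pressure, __GFP_NORETRY
otherwise).

```c
#include <stdbool.h>

/* Stand-in bits for this model only; NOT the kernel's real values. */
enum model_gfp {
	MODEL_KSWAPD_RECLAIM = 1u << 0, /* may wake kswapd */
	MODEL_DIRECT_RECLAIM = 1u << 1, /* caller may reclaim and sleep */
	MODEL_NORETRY        = 1u << 2, /* give up after light reclaim */
	MODEL_RETRY_MAYFAIL  = 1u << 3, /* try harder, but may still fail */
	MODEL_NOWARN         = 1u << 4, /* no warning on failure */
};

/*
 * Pick flags for the array allocation. A caller that cannot sleep gets
 * a NOWAIT-like set; a caller that can sleep joins reclaim, and how
 * hard it retries depends on whether memory is under pressure.
 */
unsigned int model_pick_flags(bool can_sleep, bool under_pressure)
{
	unsigned int flags = MODEL_KSWAPD_RECLAIM | MODEL_NOWARN;

	if (!can_sleep)
		return flags;

	flags |= MODEL_DIRECT_RECLAIM;
	flags |= under_pressure ? MODEL_NORETRY : MODEL_RETRY_MAYFAIL;
	return flags;
}
```

Whichever set is chosen first, the allocation success rate would still
need to be instrumented before settling on it.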
