linux-kernel - Re: [PATCH RFC tip/core/rcu] Add shrinker to shift to fast/inefficient GP mode

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200507172940.GA19915@paulmck-ThinkPad-P72>
Date:   Thu, 7 May 2020 10:29:40 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, rcu@...r.kernel.org,
        linux-kernel@...r.kernel.org, kernel-team@...com, mingo@...nel.org,
        jiangshanlai@...il.com, dipankar@...ibm.com,
        mathieu.desnoyers@...icios.com, josh@...htriplett.org,
        tglx@...utronix.de, peterz@...radead.org, rostedt@...dmis.org,
        dhowells@...hat.com, edumazet@...gle.com, fweisbec@...il.com,
        oleg@...hat.com, joel@...lfernandes.org, viro@...iv.linux.org.uk,
        Dave Chinner <david@...morbit.com>
Subject: Re: [PATCH RFC tip/core/rcu] Add shrinker to shift to
 fast/inefficient GP mode

On Thu, May 07, 2020 at 10:09:03AM -0700, Paul E. McKenney wrote:
> On Thu, May 07, 2020 at 01:00:06PM -0400, Johannes Weiner wrote:
> > On Wed, May 06, 2020 at 05:55:35PM -0700, Andrew Morton wrote:
> > > On Wed, 6 May 2020 17:42:40 -0700 "Paul E. McKenney" <paulmck@...nel.org> wrote:
> > > 
> > > > This commit adds a shrinker so as to inform RCU when memory is scarce.
> > > > RCU responds by shifting into the same fast and inefficient mode that is
> > > > used in the presence of excessive numbers of RCU callbacks.  RCU remains
> > > > in this state for one-tenth of a second, though this time window can be
> > > > extended by another call to the shrinker.
> > 
> > We may be able to use shrinkers here, but merely being invoked does
> > not carry a reliable distress signal.
> > 
> > Shrinkers get invoked whenever vmscan runs. It's a useful indicator
> > for when to age an auxiliary LRU list - test references, clear and
> > rotate or reclaim stale entries. The urgency, and what can and cannot
> > be considered "stale", is encoded in the callback frequency and scan
> > counts, and meant to be relative to the VM's own rate of aging: "I've
> > tested X percent of mine for recent use, now you go and test the same
> > share of your pool." It doesn't translate well to other
> > interpretations of the callbacks, although people have tried.
> 
> Would it make sense for RCU to interpret two invocations within (say)
> 100ms of each other as indicating urgency?  (Hey, I had to ask!)
> 
> > > > If it proves feasible, a later commit might add a function call directly
> > > > indicating the end of the period of scarce memory.
> > > 
> > > (Cc David Chinner, who often has opinions on shrinkers ;))
> > > 
> > > It's a bit abusive of the intent of the slab shrinkers, but I don't
> > > immediately see a problem with it.  Always returning 0 from
> > > ->scan_objects might cause a problem in some situations(?).
> > > 
> > > Perhaps we should have a formal "system getting low on memory, please
> > > do something" notification API.
> > 
> > It's tricky to find a useful definition of what low on memory
> > means. In the past we've used sc->priority cutoffs, the vmpressure
> > interface (reclaimed/scanned - reclaim efficiency cutoffs), oom
> > notifiers (another reclaim efficiency cutoff). But none of these
> > reliably capture "distress", and they vary highly between different
> > hardware setups. It can be hard to trigger OOM itself on fast IO
> > devices, even when the machine is way past useful (where useful is
> > somewhat subjective to the user). Userspace OOM implementations that
> > consider userspace health (also subjective) are getting more common.
> > 
> > > How significant is this?  How much memory can RCU consume?
> > 
> > I think if rcu can end up consuming a significant share of memory, one
> > way that may work would be to do proper shrinker integration and track
> > the age of its objects relative to the age of other allocations in the
> > system. I.e. toss them all on a clock list with "new" bits and shrink
> > them at VM velocity. If the shrinker sees objects with new bit set,
> > clear and rotate. If it sees objects without them, we know rcu_heads
> > outlive cache pages etc. and should probably cycle faster too.
> 
> It would be easy for RCU to pass back (or otherwise use) the age of the
> current grace period, if that would help.
> 
> Tracking the age of individual callbacks is out of the question due to
> memory overhead, but RCU could approximate this via statistical sampling.
> Comparing this to grace-period durations could give information as to
> whether making grace periods go faster would be helpful.
> 
> But, yes, it would be better to have an elusive unambiguous indication
> of distress.  ;-)

And I have dropped this patch for the time being, but I do hope that
it served a purpose in illustrating that it is not difficult to put RCU
into a fast-but-inefficient mode when needed.

							Thanx, Paul