linux-kernel - Re: [PATCH RT 0/4] Address rcutorture issues


Message-Id: <20190620191259.GT26519@linux.ibm.com>
Date:   Thu, 20 Jun 2019 12:12:59 -0700
From:   "Paul E. McKenney" <paulmck@...ux.ibm.com>
To:     Scott Wood <swood@...hat.com>
Cc:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Juri Lelli <juri.lelli@...hat.com>,
        Clark Williams <williams@...hat.com>,
        linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RT 0/4] Address rcutorture issues

On Tue, Jun 18, 2019 at 08:19:04PM -0500, Scott Wood wrote:
> With these patches, rcutorture mostly works on PREEMPT_RT_FULL.  I still
> once in a while get forward progress complaints (particularly,
> rcu_torture_fwd_prog_cr) when a grace period is held up for a few seconds
> after which point so many callbacks have been enqueued that even making
> reasonable progress isn't going to beat the timeout.  I believe I've only
> seen this when running heavy loads in addition to rcutorture (though I've
> done more testing under load than without); I don't know whether the
> forward progress tests are expected to work under such load.

With current -rcu, the torture tests are ahead of RCU's forward-progress
code, so rcu_torture_fwd_prog_cr() failures are expected behavior,
particularly in the TREE04 and TREE07 scenarios.  This is more of a
problem for real-time because of callback offloading, which removes the
backpressure that normally exists from callback processing to whatever
is running on that same CPU generating so many callbacks.  So this issue
has been providing me some entertainment.  ;-)
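
To make the backpressure point concrete, here is a toy model in C.  It is
only a sketch and not kernel code: simulate(), struct sim_result, and the
"work unit" budget are hypothetical names invented for illustration, and
the model assumes each enqueue and each invocation costs one unit.  When
callbacks are invoked on the generating CPU, invocation eats into the
budget available for generating more; when they are offloaded, the
generator keeps its whole budget and the backlog depends entirely on how
fast the other CPU drains it.

```c
/* Toy model of callback backpressure (hypothetical, not kernel code). */
struct sim_result {
	long queued_total;	/* callbacks enqueued over the whole run */
	long backlog_max;	/* largest number pending after any tick */
};

/*
 * ticks:       number of simulated time steps
 * budget:      work units the generating CPU has per tick
 * offloaded:   nonzero if callbacks are invoked on some other CPU
 * remote_rate: callbacks that other CPU can invoke per tick
 */
static struct sim_result simulate(int ticks, long budget, int offloaded,
				  long remote_rate)
{
	struct sim_result r = { 0, 0 };
	long backlog = 0;

	for (int t = 0; t < ticks; t++) {
		long spare = budget;
		long cap = offloaded ? remote_rate : spare;
		long done = backlog < cap ? backlog : cap;

		backlog -= done;
		if (!offloaded)
			spare -= done;	/* local invocation steals time */

		/* Whatever time is left goes to generating callbacks. */
		backlog += spare;
		r.queued_total += spare;
		if (backlog > r.backlog_max)
			r.backlog_max = backlog;
	}
	return r;
}
```

With a budget of 100 units over 10 ticks, the non-offloaded case never has
more than 100 callbacks pending, while the offloaded case with a remote
rate of 60 per tick grows by 40 per tick.  The numbers are made up; only
the shape of the result matters.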

If you put lots of load on the system while running rcutorture, but
leave the CPU running rcu_torture_fwd_prog_cr() otherwise idle, then yes,
you can eventually force rcu_torture_fwd_prog_cr() to fail pretty much
no matter what RCU does otherwise, particularly given that rcutorture is
running within a guest OS.  There has been some discussion of RCU asking
for help from the hypervisor, but it hasn't yet gotten further than
discussion.

For another example, if all but one of the CPUs are no-CBs CPUs
(or, equivalently, nohz_full CPUs), and all of the rcuo kthreads
are constrained to run on the remaining CPU, it is not hard to create
workloads that produce more callbacks than that remaining CPU can possibly
keep up with.  The traditional position has of course been the Spider-Man
principle "With great power comes great responsibility".  ;-)
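
The arithmetic behind the no-CBs example can be sketched as follows.
This is a back-of-the-envelope helper, not anything from the kernel:
seconds_to_backlog() and all of its parameters are hypothetical, and it
assumes steady per-CPU generation and invocation rates, which real
workloads will not have.

```c
/*
 * Hypothetical helper: how long until the pending-callback backlog
 * reaches "limit", given "producers" CPUs each queueing gen_per_sec
 * callbacks and a single CPU invoking invoke_per_sec of them.
 * Returns -1.0 if the invoking CPU keeps up and the backlog never grows.
 */
static double seconds_to_backlog(int producers, double gen_per_sec,
				 double invoke_per_sec, double limit)
{
	double net = producers * gen_per_sec - invoke_per_sec;

	if (net <= 0.0)
		return -1.0;	/* consumer is at least as fast as producers */
	return limit / net;
}
```

For instance, with made-up rates of 100,000 callbacks per second on each
of seven producing CPUs and 500,000 invocations per second on the
remaining CPU, the backlog grows by 200,000 per second and reaches one
million callbacks in five seconds.  With only three producers at the same
rates, the remaining CPU keeps up.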

							Thanx, Paul

> Scott Wood (4):
>   rcu: Acquire RCU lock when disabling BHs
>   sched: migrate_enable: Use sleeping_lock to indicate involuntary sleep
>   rcu: unlock special: Treat irq and preempt disabled the same
>   rcutorture: Avoid problematic critical section nesting
> 
>  include/linux/rcupdate.h |  4 +++
>  include/linux/sched.h    |  4 +--
>  kernel/rcu/rcutorture.c  | 92 ++++++++++++++++++++++++++++++++++++++++--------
>  kernel/rcu/tree_plugin.h | 12 ++-----
>  kernel/rcu/update.c      |  4 +++
>  kernel/sched/core.c      |  2 ++
>  kernel/softirq.c         | 12 +++++--
>  7 files changed, 102 insertions(+), 28 deletions(-)
> 
> -- 
> 1.8.3.1
>