Message-ID: <20190820014015.GA199862@google.com>
Date:   Mon, 19 Aug 2019 21:40:15 -0400
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Scott Wood <swood@...hat.com>
Cc:     linux-kernel@...r.kernel.org,
        Josh Triplett <josh@...htriplett.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Paul E. McKenney" <paulmck@...ux.ibm.com>, rcu@...r.kernel.org,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [RFC v2] rcu/tree: Try to invoke_rcu_core() if in_irq() during
 unlock

On Mon, Aug 19, 2019 at 07:14:38PM -0500, Scott Wood wrote:
> On Sun, 2019-08-18 at 17:49 -0400, Joel Fernandes (Google) wrote:
> > When we're in hard interrupt context in rcu_read_unlock_special(), we
> > can still benefit from invoke_rcu_core() doing wake ups of rcuc
> > threads when the !use_softirq parameter is passed.  This is safe
> > to do because:
> 
> What is the benefit, beyond skipping the irq work overhead?  Is there some
> reason to specifically want the rcuc thread woken rather than just getting
> into the scheduler (and thus rcu_note_context_switch) as soon as possible?

Isn't skipping the irq work overhead enough of a benefit?

Anyway, I think it is useful in this scenario:
 Consider exp==true when the rcu_read_unlock() is done on a nohz_full CPU.

 If you simply 'get into the scheduler' as you pointed out, that is not
 enough to end the grace period. The quiescent state has to be reported up
 the tree and propagated to the root node. This happens in only 2 places:
 	1. The scheduler tick raising the softirq, at the end of which the
	   RCU core runs from softirq context or via invoke_rcu_core().
	2. The FQS loop, which needs to see a dyntick-idle transition on the
	   CPU (usermode/idle to kernel or vice versa).

Case 1 is unlikely since the tick may be turned off, though I worked with
Paul last week on turning it back on in this situation, and that is doing
better.
Case 2 does not happen if we're looping in kernel mode, as the small model
below illustrates.
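
To make case 2 concrete, here is a tiny standalone model (userspace C,
compilable as-is; the struct and function names are made up, and the real
logic is the FQS loop's dyntick sampling in the kernel):

/*
 * Standalone model of case 2, not the kernel code: the FQS loop
 * snapshots a per-CPU dynticks counter that is bumped on transitions
 * between kernel and usermode/idle, and rechecks it later.
 */
#include <stdbool.h>
#include <stdio.h>

struct cpu_model {
	unsigned long dynticks;	/* bumped on usermode/idle <-> kernel */
};

/* FQS-style check: has the CPU transitioned since we took the snapshot? */
static bool fqs_saw_qs(const struct cpu_model *cpu, unsigned long snap)
{
	return cpu->dynticks != snap;
}

int main(void)
{
	struct cpu_model cpu = { .dynticks = 4 };
	unsigned long snap = cpu.dynticks;

	/* CPU keeps looping in kernel mode: no transition is recorded,
	 * so FQS cannot report a quiescent state on its behalf. */
	printf("looping in kernel: qs seen = %d\n", fqs_saw_qs(&cpu, snap));

	cpu.dynticks++;		/* CPU enters usermode or idle */
	printf("after transition: qs seen = %d\n", fqs_saw_qs(&cpu, snap));
	return 0;
}

If the CPU never leaves the kernel, the snapshot never changes and FQS has
nothing to report.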

In this scenario, calling invoke_rcu_core() directly is better than
scheduling the IRQ work. I don't think the IRQ work will do anything for a
nohz_full CPU, but I am not sure about that.
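
Roughly, the shape of the change being discussed, as a standalone sketch
rather than the actual patch (deferred_qs_action() and its enum are made-up
names; the real decision lives in rcu_read_unlock_special()):

#include <stdbool.h>
#include <stdio.h>

enum defer_action {
	RAISE_SOFTIRQ,		/* use_softirq: raise RCU_SOFTIRQ */
	INVOKE_RCU_CORE,	/* !use_softirq && in_irq: wake rcuc now */
	QUEUE_IRQ_WORK,		/* otherwise: defer via irq_work */
};

static enum defer_action deferred_qs_action(bool in_irq, bool use_softirq)
{
	if (use_softirq)
		return RAISE_SOFTIRQ;
	if (in_irq)
		return INVOKE_RCU_CORE;	/* the RFC's addition, argued to
					 * be safe from hard-IRQ context */
	return QUEUE_IRQ_WORK;
}

int main(void)
{
	/* The case the RFC targets: hard IRQ, rcuc kthreads in use. */
	printf("in_irq && !use_softirq -> %d (INVOKE_RCU_CORE)\n",
	       deferred_qs_action(true, false));
	return 0;
}

The interesting branch is the middle one: from hard-IRQ context with rcuc
kthreads in use, invoke_rcu_core() wakes them directly and the irq_work
round trip is skipped.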

To give more background on why I arrived at this patch: I noticed that this
call to invoke_rcu_core() used to be made here, but it was removed because
the commit removing it said it was pointless, as it did not do anything.
I think it does do something, which is why I introduced it back.
rcu_read_unlock_special() is a slow path anyway, so one more branch should
be harmless and could actually be beneficial. However, this is just an RFC,
so please treat it as such. I am running more tests on it based on Paul's
suggestions and will look more closely at it tomorrow.

Thanks!

 - Joel
