lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20121004213933.GI2464@linux.vnet.ibm.com>
Date:	Thu, 4 Oct 2012 14:39:34 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	ananaza@....fi
Cc:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rcu: Add "expedite always" switch

On Fri, Oct 05, 2012 at 12:00:53AM +0300, Antti P Miettinen wrote:
> Add a module parameter for always using expedited RCU primitives.

Cool, looks quite close!!!  Please see below for a couple of questions and
comments.

> Signed-off-by: Antti P Miettinen <amiettinen@...dia.com>
> ---
>  include/linux/rcupdate.h |    2 ++
>  kernel/rcupdate.c        |    4 ++++
>  kernel/rcutiny_plugin.h  |    5 ++++-
>  kernel/rcutree.c         |   10 ++++++++--
>  kernel/rcutree_plugin.h  |    5 ++++-
>  kernel/srcu.c            |    4 +++-
>  6 files changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 7c968e4..b37efae 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -45,6 +45,8 @@
>  #include <linux/bug.h>
>  #include <linux/compiler.h>
> 
> +extern int rcu_expedited;
> +

This needs to move to kernel/rcu.h, otherwise it is being exported to
the entire kernel.  Don't get me wrong, if someone in the kernel needs
to be able switch to and from expedited mode, I am OK with their doing
that -- but via a proper API rather than a bare store to a variable.  ;-)

(That said, the networking guys are doing just fine by having
synchronize_net() call either synchronize_rcu_expedited() or
synchronize_rcu(), depending on internal networking locking state.
So I do not believe that we need an API, not yet, anyway.)

>  #ifdef CONFIG_RCU_TORTURE_TEST
>  extern int rcutorture_runnable; /* for sysctl */
>  #endif /* #ifdef CONFIG_RCU_TORTURE_TEST */
> diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
> index 29ca1c6..6057e58 100644
> --- a/kernel/rcupdate.c
> +++ b/kernel/rcupdate.c
> @@ -46,12 +46,16 @@
>  #include <linux/export.h>
>  #include <linux/hardirq.h>
>  #include <linux/delay.h>
> +#include <linux/module.h>
> 
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/rcu.h>
> 
>  #include "rcu.h"
> 
> +int rcu_expedited;
> +module_param(rcu_expedited, int, 0644);

Ah, the joys of exported-to-userspace API design...  ;-)

Summary: Please s/0644/0/ and add a KERNEL_ATTR_RW()-style sysfs interface.

If you want the details, please read on!

This exports the ability to switch expedited mode on and off at boot
time and also via sysfs.  However, this approach to the sysfs interface
does not gracefully handle multiple things enabling and disabling
expedited mode concurrently.  There are a couple of possible approaches:

1.	Look at current use cases.  These seem to be networking
	applications such as wireshark that reportedly cause the kernel
	to do a large number of RCU grace periods when starting up.
	These would be best served by an enter/exit style that
	increments and decrements a counter.

2.	Of course, applications can abort without properly cleaning
	up, which would argue for some sort of open/close protocol
	relying on the fact that all the application's fds are
	closed on any sort of exit.  But that leaves scripts out.

3.	One way to handle the scripts is to use redirection tricks,
	opening (say) FD 9 to the device.

4.	Do the bare minimum initially, so that when/if other use
	cases appear, they can be accommodated -- but using real
	information about what they really need.

>From what I can see, #4 is the only approach with a decent chance.
So, what is the bare minimum?

a.	Boot-time setting, allowing small embedded systems to
	unconditionally speed up grace periods.  You have this with the
	module_param() above.  Setting the permissions to zero would
	disable the userspace direct-set ability that is likely to get
	us into backwards-compatibility hot water in the future.

b.	Maybe sysfs setting.  Initially, this can be simple "on" and "off",
	exported with root-only access (like you have above).  If more
	elaborate use cases appear, this might become the adminstrative
	override.  The key point is that this needs to go through some
	sort of separate function (for example, via the KERNEL_ATTR_RW()
	macro) so that if it needs to change in the future, we can do
	the change easily.

Or am I missing something?

The rest looks good!

							Thanx, Paul

> +
>  #ifdef CONFIG_PREEMPT_RCU
> 
>  /*
> diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
> index 3d01902..f85016a 100644
> --- a/kernel/rcutiny_plugin.h
> +++ b/kernel/rcutiny_plugin.h
> @@ -706,7 +706,10 @@ void synchronize_rcu(void)
>  		return;
> 
>  	/* Once we get past the fastpath checks, same code as rcu_barrier(). */
> -	rcu_barrier();
> +	if (rcu_expedited)
> +		synchronize_rcu_expedited();
> +	else
> +		rcu_barrier();
>  }
>  EXPORT_SYMBOL_GPL(synchronize_rcu);
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 4fb2376..744c117 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -2221,7 +2221,10 @@ void synchronize_sched(void)
>  			   "Illegal synchronize_sched() in RCU-sched read-side critical section");
>  	if (rcu_blocking_is_gp())
>  		return;
> -	wait_rcu_gp(call_rcu_sched);
> +	if (rcu_expedited)
> +		synchronize_sched_expedited();
> +	else
> +		wait_rcu_gp(call_rcu_sched);
>  }
>  EXPORT_SYMBOL_GPL(synchronize_sched);
> 
> @@ -2242,7 +2245,10 @@ void synchronize_rcu_bh(void)
>  			   "Illegal synchronize_rcu_bh() in RCU-bh read-side critical section");
>  	if (rcu_blocking_is_gp())
>  		return;
> -	wait_rcu_gp(call_rcu_bh);
> +	if (rcu_expedited)
> +		synchronize_rcu_bh_expedited();
> +	else
> +		wait_rcu_gp(call_rcu_bh);
>  }
>  EXPORT_SYMBOL_GPL(synchronize_rcu_bh);
> 
> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index f921154..0e9ca8b 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -679,7 +679,10 @@ void synchronize_rcu(void)
>  			   "Illegal synchronize_rcu() in RCU read-side critical section");
>  	if (!rcu_scheduler_active)
>  		return;
> -	wait_rcu_gp(call_rcu);
> +	if (rcu_expedited)
> +		synchronize_rcu_expedited();
> +	else
> +		wait_rcu_gp(call_rcu);
>  }
>  EXPORT_SYMBOL_GPL(synchronize_rcu);
> 
> diff --git a/kernel/srcu.c b/kernel/srcu.c
> index 97c465e..00c4169 100644
> --- a/kernel/srcu.c
> +++ b/kernel/srcu.c
> @@ -464,7 +464,9 @@ static void __synchronize_srcu(struct srcu_struct *sp, int trycount)
>   */
>  void synchronize_srcu(struct srcu_struct *sp)
>  {
> -	__synchronize_srcu(sp, SYNCHRONIZE_SRCU_TRYCOUNT);
> +	__synchronize_srcu(sp, rcu_expedited
> +			   ? SYNCHRONIZE_SRCU_EXP_TRYCOUNT
> +			   : SYNCHRONIZE_SRCU_TRYCOUNT);
>  }
>  EXPORT_SYMBOL_GPL(synchronize_srcu);
> 
> -- 
> 1.7.4.1
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ