lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 21 May 2015 17:09:27 +0200
From:	Denys Vlasenko <dvlasenk@...hat.com>
To:	Steven Rostedt <rostedt@...dmis.org>
CC:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Josh Triplett <josh@...htriplett.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
	Oleg Nesterov <oleg@...hat.com>
Subject: Re: [PATCH] rcu: Deinline rcu_read_lock_sched_held() if DEBUG_LOCK_ALLOC

On 05/21/2015 03:45 PM, Steven Rostedt wrote:
> On Thu, 21 May 2015 12:04:07 +0200
> Denys Vlasenko <dvlasenk@...hat.com> wrote:
> 
>> DEBUG_LOCK_ALLOC=y is not a production setting, but it is
>> not very unusual either. Many developers routinely
>> use kernels built with it enabled.
>>
>> Apart from being selected by hand, it is also auto-selected by
>> PROVE_LOCKING "Lock debugging: prove locking correctness" and
>> LOCK_STAT "Lock usage statistics" config options.
>> LOCK STAT is necessary for "perf lock" to work.
>>
>> I wouldn't spend too much time optimizing it, but this particular
>> function has a very large cost in code size: when it is deinlined,
>> code size decreases by 830,000 bytes:
>>
>>     text     data      bss       dec     hex filename
>> 85674192 22294776 20627456 128596424 7aa39c8 vmlinux.before
>> 84837612 22294424 20627456 127759492 79d7484 vmlinux
>>
>> (with this config: http://busybox.net/~vda/kernel_config)
>>
>> Signed-off-by: Denys Vlasenko <dvlasenk@...hat.com>
>> CC: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
>> CC: Josh Triplett <josh@...htriplett.org>
>> CC: Steven Rostedt <rostedt@...dmis.org>
>> CC: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
>> CC: Lai Jiangshan <laijs@...fujitsu.com>
>> CC: linux-kernel@...r.kernel.org
>> CC: Tejun Heo <tj@...nel.org>
>> CC: Oleg Nesterov <oleg@...hat.com>
>> ---
>>  include/linux/rcupdate.h | 40 ++-----------------------------------
>>  kernel/rcu/update.c      | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 54 insertions(+), 38 deletions(-)
>>
>> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
>> index 7809749..6024a65 100644
>> --- a/include/linux/rcupdate.h
>> +++ b/include/linux/rcupdate.h
>> @@ -439,46 +439,10 @@ int rcu_read_lock_bh_held(void);
>>   * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an
>>   * RCU-sched read-side critical section.  In absence of
>>   * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
>> - * critical section unless it can prove otherwise.  Note that disabling
>> - * of preemption (including disabling irqs) counts as an RCU-sched
>> - * read-side critical section.  This is useful for debug checks in functions
>> - * that required that they be called within an RCU-sched read-side
>> - * critical section.
>> - *
>> - * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
>> - * and while lockdep is disabled.
>> - *
>> - * Note that if the CPU is in the idle loop from an RCU point of
>> - * view (ie: that we are in the section between rcu_idle_enter() and
>> - * rcu_idle_exit()) then rcu_read_lock_held() returns false even if the CPU
>> - * did an rcu_read_lock().  The reason for this is that RCU ignores CPUs
>> - * that are in such a section, considering these as in extended quiescent
>> - * state, so such a CPU is effectively never in an RCU read-side critical
>> - * section regardless of what RCU primitives it invokes.  This state of
>> - * affairs is required --- we need to keep an RCU-free window in idle
>> - * where the CPU may possibly enter into low power mode. This way we can
>> - * notice an extended quiescent state to other CPUs that started a grace
>> - * period. Otherwise we would delay any grace period as long as we run in
>> - * the idle task.
>> - *
>> - * Similarly, we avoid claiming an SRCU read lock held if the current
>> - * CPU is offline.
>> + * critical section unless it can prove otherwise.
>>   */
>>  #ifdef CONFIG_PREEMPT_COUNT
>> -static inline int rcu_read_lock_sched_held(void)
>> -{
>> -	int lockdep_opinion = 0;
>> -
>> -	if (!debug_lockdep_rcu_enabled())
>> -		return 1;
>> -	if (!rcu_is_watching())
>> -		return 0;
>> -	if (!rcu_lockdep_current_cpu_online())
>> -		return 0;
>> -	if (debug_locks)
>> -		lockdep_opinion = lock_is_held(&rcu_sched_lock_map);
>> -	return lockdep_opinion || preempt_count() != 0 || irqs_disabled();
>> -}
>> +int rcu_read_lock_sched_held(void);
>>  #else /* #ifdef CONFIG_PREEMPT_COUNT */
>>  static inline int rcu_read_lock_sched_held(void)
>>  {
>> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
>> index e0d31a3..e02218f 100644
>> --- a/kernel/rcu/update.c
>> +++ b/kernel/rcu/update.c
>> @@ -62,6 +62,58 @@ MODULE_ALIAS("rcupdate");
>>  
>>  module_param(rcu_expedited, int, 0);
>>  
>> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
>> +/**
>> + * rcu_read_lock_sched_held() - might we be in RCU-sched read-side critical section?
>> + *
>> + * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an
>> + * RCU-sched read-side critical section.  In absence of
>> + * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
>> + * critical section unless it can prove otherwise.  Note that disabling
>> + * of preemption (including disabling irqs) counts as an RCU-sched
>> + * read-side critical section.  This is useful for debug checks in functions
>> + * that required that they be called within an RCU-sched read-side
>> + * critical section.
>> + *
>> + * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
>> + * and while lockdep is disabled.
>> + *
>> + * Note that if the CPU is in the idle loop from an RCU point of
>> + * view (ie: that we are in the section between rcu_idle_enter() and
>> + * rcu_idle_exit()) then rcu_read_lock_held() returns false even if the CPU
>> + * did an rcu_read_lock().  The reason for this is that RCU ignores CPUs
>> + * that are in such a section, considering these as in extended quiescent
>> + * state, so such a CPU is effectively never in an RCU read-side critical
>> + * section regardless of what RCU primitives it invokes.  This state of
>> + * affairs is required --- we need to keep an RCU-free window in idle
>> + * where the CPU may possibly enter into low power mode. This way we can
>> + * notice an extended quiescent state to other CPUs that started a grace
>> + * period. Otherwise we would delay any grace period as long as we run in
>> + * the idle task.
>> + *
>> + * Similarly, we avoid claiming an SRCU read lock held if the current
>> + * CPU is offline.
>> + */
>> +#ifdef CONFIG_PREEMPT_COUNT
>> +int rcu_read_lock_sched_held(void)
>> +{
>> +	int lockdep_opinion = 0;
>> +
>> +	if (!debug_lockdep_rcu_enabled())
>> +		return 1;
>> +	if (!rcu_is_watching())
>> +		return 0;
>> +	if (!rcu_lockdep_current_cpu_online())
>> +		return 0;
>> +	if (debug_locks)
>> +		lockdep_opinion = lock_is_held(&rcu_sched_lock_map);
>> +	return lockdep_opinion || preempt_count() != 0 || irqs_disabled();
>> +}
>> +EXPORT_SYMBOL(rcu_read_lock_sched_held);
>> +#else
>> +/* !CONFIG_PREEMPT_COUNT - the function is inlined to always return 1 */
> 
> Nuke the #else. It's not needed and this is a common enough practice to
> have the static inline foo() { } when disabled that we do not need to
> comment about it here.

Sending patch v2 in a few minutes.

>> +#endif
> 
> Hmm, you added two #ifdef, and one #endif. How does this even compile??

Er... it... doesn't.

There was "#if defined CONFIG_DEBUG_LOCK_ALLOC && defined CONFIG_PREEMPT_COUNT"
but then I split it into two #ifs to have a nice explanatory empty
#else clause. And I did not test it after that edit
("what could possibly go wrong?"). Sorry.

Patch v2 I'm sending _is_ tested.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ