Message-ID: <48B1A77E.5070504@colorfullife.com>
Date:	Sun, 24 Aug 2008 20:25:02 +0200
From:	Manfred Spraul <manfred@...orfullife.com>
To:	paulmck@...ux.vnet.ibm.com
CC:	linux-kernel@...r.kernel.org, cl@...ux-foundation.org,
	mingo@...e.hu, akpm@...ux-foundation.org, dipankar@...ibm.com,
	josht@...ux.vnet.ibm.com, schamp@....com, niv@...ibm.com,
	dvhltc@...ibm.com, ego@...ibm.com, laijs@...fujitsu.com,
	rostedt@...dmis.org
Subject: Re: [PATCH, RFC, tip/core/rcu] scalable classic RCU implementation

Paul E. McKenney wrote:
>>> + */
>>> +struct rcu_node {
>>> +	spinlock_t lock;
>>> +	unsigned long	qsmask;	/* CPUs or groups that need to switch in      */
>>> +				/*  order for current grace period to proceed.*/
>>> +	unsigned long	qsmaskinit;
>>> +				/* Per-GP initialization for qsmask.	      */
>>>
>> I'm not sure if a bitmap is the right storage. If I understand the code
>> correctly, it contains two pieces of information:
>> 1) If the bitmap is clear, then all cpus have completed whatever they
>> need to do.
>> A counter is more efficient than a bitmap. In particular, it would allow
>> choosing the optimal fan-out, independent of 32/64 bits.
>> 2) Whether the current cpu must do something to complete the current
>> grace period.
>> This is local information; usually (always?) only the current cpu needs
>> to know whether it must do something.
>> It doesn't need to be stored in a shared structure; the information
>> could be kept in a per-cpu structure.
>>
>
> I am using the bitmap in force_quiescent_state() to work out which CPUs'
> dynticks state to check and which CPUs to send reschedule IPIs to.  I
> could scan all of the per-CPU rcu_data structures, but I am assuming that
> after a few jiffies there would typically be relatively few CPUs still
> needing to report a quiescent state.  Given this assumption, on systems
> with large numbers of CPUs, scanning the bitmask greatly reduces the
> number of cache misses compared to scanning the rcu_data structures.
>
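(For concreteness, roughly the access pattern you describe; the helper
names below are made up for illustration, they are not from your patch:)

/*
 * Sketch only, locking omitted: find the stragglers by scanning
 * one bitmask word per node instead of touching one rcu_data
 * structure per cpu.
 */
static void fqs_scan_sketch(struct rcu_node *rnp)
{
	unsigned long mask = rnp->qsmask;	/* one cache line */
	int cpu;

	for (cpu = 0; mask != 0; cpu++, mask >>= 1) {
		if (!(mask & 1))
			continue;
		/* hypothetical helper: check the cpu's dynticks
		 * state or send it a reschedule IPI */
		poke_straggler(node_first_cpu(rnp) + cpu);
	}
}
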
It's an optimization question: which is rarer, force_quiescent_state() or
"normal" cpu_quiet() calls?
You have optimized for force_quiescent_state(); I have optimized for the
"normal" cpu_quiet() calls. [OK, I admit: force_quiescent_state() is still
missing in my code.]
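For reference, the counter variant I mean, as a simplified sketch (not my
actual code; propagate_quiet() is a placeholder for reporting completion
to the parent node):

/*
 * Sketch: one counter per node instead of a bitmap.  The fan-out
 * can be any value, not limited to BITS_PER_LONG, and a "normal"
 * cpu_quiet() touches only a single shared word.
 */
struct rcu_node_cnt {
	spinlock_t lock;
	int qscount;	/* cpus/groups that still owe a quiescent state */
};

static void cpu_quiet_sketch(struct rcu_node_cnt *rnp)
{
	spin_lock(&rnp->lock);
	if (--rnp->qscount == 0)
		propagate_quiet(rnp);	/* placeholder: report upward */
	spin_unlock(&rnp->lock);
}

The "did this cpu already report?" flag then lives in the per-cpu
rcu_data structure, as described under 2) above.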
Do you have any statistics?

--
    Manfred
