Message-ID: <48B1A77E.5070504@colorfullife.com>
Date: Sun, 24 Aug 2008 20:25:02 +0200
From: Manfred Spraul <manfred@...orfullife.com>
To: paulmck@...ux.vnet.ibm.com
CC: linux-kernel@...r.kernel.org, cl@...ux-foundation.org,
mingo@...e.hu, akpm@...ux-foundation.org, dipankar@...ibm.com,
josht@...ux.vnet.ibm.com, schamp@....com, niv@...ibm.com,
dvhltc@...ibm.com, ego@...ibm.com, laijs@...fujitsu.com,
rostedt@...dmis.org
Subject: Re: [PATCH, RFC, tip/core/rcu] scalable classic RCU implementation
Paul E. McKenney wrote:
>>> + */
>>> +struct rcu_node {
>>> + spinlock_t lock;
>>> + unsigned long qsmask; /* CPUs or groups that need to switch in */
>>> + /* order for current grace period to proceed.*/
>>> + unsigned long qsmaskinit;
>>> + /* Per-GP initialization for qsmask. */
>>>
>>>
>> I'm not sure a bitmap is the right storage. If I understand the code
>> correctly, it holds two pieces of information:
>> 1) If the bitmap is clear, then all cpus have completed whatever they need
>> to do.
>> A counter is more efficient than a bitmap. In particular, it would allow
>> choosing the optimal fan-out independently of 32/64 bits.
>> 2) Whether the current cpu must do something to complete the
>> current period.
>> This is local information: usually (always?) only the current cpu needs
>> to know whether it must do something.
>> So this doesn't need to be stored in a shared structure; the information
>> could be stored in a per-cpu structure.
>>
>
> I am using the bitmap in force_quiescent_state() to work out which CPUs
> to check for dynticks and which to send reschedule IPIs to. I could scan all
> of the per-CPU rcu_data structures, but am assuming that after a few
> jiffies there would typically be relatively few CPUs still needing to do
> a quiescent state. Given this assumption, on systems with large numbers
> of CPUs, scanning the bitmask greatly reduces the number of cache misses
> compared to scanning the rcu_data structures.
>
>
It's an optimization question: which is rarer, force_quiescent_state() or
"normal" cpu_quiet calls?
You have optimized for force_quiescent_state(); I have optimized for
"normal" cpu_quiet calls. [OK, I admit: force_quiescent_state() is still
missing in my code.]
Do you have any statistics?
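
To make the trade-off concrete, here is a minimal user-space sketch of
the two bookkeeping styles. It is illustrative only and is not code from
either patch: the names node_bitmap, node_counter, bitmap_cpu_quiet(),
bitmap_scan() and counter_cpu_quiet() are made up for this example, and
locking is left out (real code would presumably do these updates under
the node's lock).

```c
/* Hypothetical, simplified stand-ins; not the structures from either patch. */

struct node_bitmap {
	unsigned long qsmask;	/* bit n set: CPU n still owes a quiescent state */
};

struct node_counter {
	unsigned int outstanding;	/* how many CPUs still owe a quiescent state */
};

/*
 * Bitmap style: a CPU reports quiescence by clearing its bit.
 * Returns 1 once no bits remain, i.e. the grace period may proceed.
 * The fan-out is capped by the number of bits in an unsigned long.
 */
static int bitmap_cpu_quiet(struct node_bitmap *n, unsigned int cpu)
{
	n->qsmask &= ~(1UL << cpu);
	return n->qsmask == 0;
}

/*
 * Bitmap style: list the CPUs that are still outstanding without
 * touching any per-CPU data. This is the property that helps a
 * force_quiescent_state()-like scan. Returns the number of CPUs
 * written to out[].
 */
static unsigned int bitmap_scan(const struct node_bitmap *n,
				unsigned int *out, unsigned int max)
{
	unsigned long m = n->qsmask;
	unsigned int count = 0;
	unsigned int cpu;

	for (cpu = 0; m != 0 && count < max; cpu++, m >>= 1) {
		if (m & 1UL)
			out[count++] = cpu;
	}
	return count;
}

/*
 * Counter style: "do I still owe a quiescent state?" is a per-CPU flag
 * (*pending); the shared node only counts. Returns 1 once the count
 * reaches zero. A CPU that owes nothing leaves the shared node alone.
 */
static int counter_cpu_quiet(struct node_counter *n, int *pending)
{
	if (!*pending)
		return 0;
	*pending = 0;
	n->outstanding--;
	return n->outstanding == 0;
}
```

In this sketch the counter variant keeps the "normal" cpu_quiet path
cheap and puts no word-size limit on the fan-out, but finding out which
CPUs are still outstanding would mean visiting every per-CPU flag. The
bitmap variant answers that question from a single word.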
--
Manfred