linux-kernel - Re: [PATCH, RFC] v4 scalable classic RCU implementation

Message-ID: <48D62B7F.10905@colorfullife.com>
Date:	Sun, 21 Sep 2008 13:09:51 +0200
From:	Manfred Spraul <manfred@...orfullife.com>
To:	paulmck@...ux.vnet.ibm.com
CC:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH, RFC] v4 scalable classic RCU implementation

Hi Paul,

Some further thoughts about design differences between your 
implementation and mine:

- rcutree's qsmaskinit is the worst-case list of cpus that could be in 
rcu read side critical sections.
- rcustate's cpu_total is the accurate list of cpus that could be in rcu 
read side critical sections.

Both variables are read rarely: for rcu_state, twice per grace period.

rcutree fixes up cpus that are "incorrectly" listed in qsmaskinit with 
force_quiescent_state(). This approach forces rcutree to use a cpu 
bitmask for qsmask and to store the "done" information in a global 
structure. Additionally, in the worst case force_quiescent_state() must 
loop over all cpus.
rcustate can use per-cpu structures and a global atomic_t. There is no 
loop over all cpus. That's a big advantage, thus I think it's worth the 
effort to maintain an accurate list.
Unfortunately, I don't have an efficient implementation for the accurate 
list.
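
To make the "per-cpu structures and a global atomic_t" point concrete, 
here is a rough user-space sketch in C11, not code from either patch. 
All names (gp_number, cpus_outstanding, cpu_gp_state, NR_CPUS) are made 
up for illustration, and it assumes cpu_total is already accurate when 
the grace period starts:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical cpu count for the sketch */

/* Per-cpu state: written only by the owning cpu. */
struct cpu_gp_state {
	int reported_gp; /* last grace period this cpu reported for */
};

static struct cpu_gp_state cpu_state[NR_CPUS];
static atomic_int gp_number;        /* current grace period, 0 = none yet */
static atomic_int cpus_outstanding; /* cpus that still owe a quiescent state */

/* Start a grace period; cpu_total must be the accurate count. */
static void start_grace_period(int cpu_total)
{
	atomic_store(&cpus_outstanding, cpu_total);
	atomic_fetch_add(&gp_number, 1);
}

/*
 * Called by a cpu when it passes a quiescent state.
 * Returns true if this call completed the grace period.
 * Note that nobody loops over all cpus.
 */
static bool report_quiescent_state(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	int gp = atomic_load(&gp_number);

	if (cpu_state[cpu].reported_gp == gp)
		return false; /* already reported for this grace period */
	cpu_state[cpu].reported_gp = gp;
	return atomic_fetch_sub(&cpus_outstanding, 1) == 1;
}
```

The countdown only reaches zero if cpu_total matches the set of cpus 
that will actually report, which is why the accurate list matters here.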

Some random ideas:
- cpu_total is only read rarely. Thus it would be ok if the read 
operation is expensive [e.g. collect data from multiple cachelines, 
acquire spinlocks...]
- updates to cpu_total happen with every interrupt on an idle system 
with no_hz. Thus the update path must be very scalable, preferably 
touching only per-cpu data. And: updates are far more frequent than 
grace periods.
- without no_hz, updates to cpu_total almost never happen. In 
particular, they are far less frequent than grace periods.

What about adding an "invalid" flag to cpu_total? The "real" data is 
stored in per-cpu structures.
- when a cpu enters/leaves nohz, then it invalidates the global 
cpu_total and updates a per-cpu structure
- when the state machine needs the number of rcu-tracked cpus, then it 
checks if the global cpu_total is valid.
If it's valid, then cpu_total is used directly. Otherwise the per-cpu 
structures are enumerated and the new value is stored as cpu_total.
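
A minimal user-space sketch of the "invalid flag" idea in C11, just to 
show the shape. The names (cpu_track, cpu_total_cache, cpu_total_valid, 
cpu_set_tracked, read_cpu_total, NR_CPUS) are hypothetical, and it 
assumes a single reader (the state machine, serialized by whatever lock 
it already holds) and default seq_cst atomics:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical cpu count for the sketch */

/* Per-cpu "real" data: is this cpu currently tracked by rcu? */
struct cpu_track {
	atomic_bool tracked;
};

static struct cpu_track cpu_track[NR_CPUS];
static atomic_int cpu_total_cache;  /* cached count, may be stale */
static atomic_bool cpu_total_valid; /* false: cache must be rebuilt */

/* Called by a cpu when it leaves (tracked=true) or enters nohz. */
static void cpu_set_tracked(int cpu, bool tracked)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	atomic_store(&cpu_track[cpu].tracked, tracked);
	/* Only write the shared flag if it is set, to limit cacheline bouncing. */
	if (atomic_load(&cpu_total_valid))
		atomic_store(&cpu_total_valid, false);
}

/* Called by the state machine (single reader assumed). */
static int read_cpu_total(void)
{
	if (atomic_load(&cpu_total_valid))
		return atomic_load(&cpu_total_cache);

	/*
	 * Mark valid before scanning: an update that the scan misses will
	 * clear the flag again, so the next read rescans.
	 */
	atomic_store(&cpu_total_valid, true);

	int total = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (atomic_load(&cpu_track[cpu].tracked))
			total++;

	atomic_store(&cpu_total_cache, total);
	return total;
}
```

On an idle no_hz system the flag stays invalid most of the time, so 
updates touch only per-cpu data and the expensive scan is paid by the 
rare reader. This only covers the caching; it does not address how a 
cpu that changes state in the middle of a grace period is accounted for.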

What do you think?

--
    Manfred