Date:	Thu, 11 Jun 2009 15:57:52 -0700 (PDT)
From:	James Huang <jamesclhuang@...oo.com>
To:	linux-kernel@...r.kernel.org, paulmck@...ux.vnet.ibm.com
Subject: real-time preemption and RCU


Hi Paul,
 
        I have read through your 2005 document on real-time preemption and RCU.
It was very interesting, and your approach to the problem (gradual improvement across each new implementation) makes the ideas very clear.
However, I am baffled by the following potential race condition, which exists in implementations 2 through 5.
To keep the case simple, let's use implementation 2 to illustrate:
 
 
CPU0           |<-- delete M1 -->|             ||           |<---- delete M2 --->|        <------  delete M3 ---->|
                                                          || 
                                                          ||
CPU1      |<-----   read M1--------->|         ||    |<-------------------------    read M2  --------------------------------------->|
                                                          ||     
                                                          ||
CPU2                                             time T: execute synchronize_kernel: rcu_ctrlblk.batch++                

Assume initially 

rcu_data[cpu0].batch = 1
rcu_data[cpu1].batch = 1
rcu_data[cpu2].batch = 1
rcu_ctrlblk.batch = 1

The following steps are executed:
(1) cpu1 read-locked rcu_ctrlblk.lock, read M1, read-unlocked rcu_ctrlblk.lock
(2) cpu0 deleted M1
(3) At time T (marked by ||), cpu2 executed synchronize_kernel: write-locked rcu_ctrlblk.lock, incremented rcu_ctrlblk.batch to 2, and write-unlocked rcu_ctrlblk.lock
(4) cpu1 read-locked rcu_ctrlblk.lock, spent a long time in its RCU read-side critical section, read M2, read-unlocked rcu_ctrlblk.lock
(5) cpu0 deleted M2.  But when it executed run_rcu(), cpu0 DID NOT see the most up-to-date value of rcu_ctrlblk.batch.
     So cpu0 just inserted M2 into cpu0's waitlist, but did not free up M1 and did not update rcu_data[cpu0].batch (i.e. it was still equal to 1).
(6) cpu0 deleted M3. At this time cpu0 saw the most up-to-date value of rcu_ctrlblk.batch (2).
     Since rcu_ctrlblk.batch (2) is larger than rcu_data[cpu0].batch (1), cpu0 freed up memory blocks in its waitlist.
     So both M1 and M2 were freed up by cpu0.  But if cpu1 was still accessing M2, this would be a problem.

Am I missing something here?  Does the smp_mb() within run_do_my_batch() have anything to do with this issue?


-- James Huang

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/