Message-ID: <87lghzta1i.fsf@yhuang-dev.intel.com>
Date: Tue, 19 Dec 2017 16:08:09 +0800
From: "Huang\, Ying" <ying.huang@...el.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, Hugh Dickins <hughd@...gle.com>,
Minchan Kim <minchan@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Shaohua Li <shli@...com>,
Mel Gorman <mgorman@...hsingularity.net>,
Jérôme Glisse <jglisse@...hat.com>,
"Michal Hocko" <mhocko@...e.com>,
Andrea Arcangeli <aarcange@...hat.com>,
"David Rientjes" <rientjes@...gle.com>,
Rik van Riel <riel@...hat.com>, Jan Kara <jack@...e.cz>,
Dave Jiang <dave.jiang@...el.com>,
Aaron Lu <aaron.lu@...el.com>
Subject: Re: [PATCH -V3 -mm] mm, swap: Fix race between swapoff and some swap operations
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> writes:
> On Tue, Dec 19, 2017 at 09:57:21AM +0800, Huang, Ying wrote:
>> "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> writes:
>>
>> > On Mon, Dec 18, 2017 at 03:41:41PM +0800, Huang, Ying wrote:
>> >> "Huang, Ying" <ying.huang@...el.com> writes:
>> >> Also, it appears that if we replace the smp_wmb() in _enable_swap_info()
>> >> with stop_machine() in some way, we can avoid the smp_rmb() in
>> >> get_swap_device(). This would further reduce the overhead in the normal
>> >> path. Can we get the same effect with RCU? For example, by using
>> >> synchronize_rcu() instead of stop_machine()?
>> >>
>> >> Hi, Paul, can you help me on this?
>> >
>> > If the key loads before and after the smp_rmb() are within the same
>> > RCU read-side critical section, -and- if one of the critical writes is
>> > before the synchronize_rcu() and the other critical write is after the
>> > synchronize_rcu(), then you normally don't need the smp_rmb().
>> >
>> > Otherwise, you likely do still need the smp_rmb().
>>
>> My question may be too general; let me make it more specific. For the
>> following program,
>>
>> "
>> int a;
>> int b;
>>
>> void initialize(void)
>> {
>> 	a = 1;
>> 	synchronize_rcu();
>> 	b = 2;
>> }
>>
>> void test(void)
>> {
>> 	int c;
>>
>> 	rcu_read_lock();
>> 	c = b;
>> 	/* ignored smp_rmb() */
>> 	if (c)
>> 		pr_info("a=%d\n", a);
>> 	rcu_read_unlock();
>> }
>> "
>>
>> Is it possible for it to show
>>
>> "
>> a=0
>> "
>>
>> in kernel log?
>>
>>
>> If it cannot, this could be a useful RCU usage model for accelerating
>> the hot path.
>
> This is not possible, and it can be verified using the Linux kernel
> memory model. An introduction to an older version of this model may
> be found here (including an introduction to litmus tests and their
> output):
>
> https://lwn.net/Articles/718628/
> https://lwn.net/Articles/720550/
>
> The litmus test and its output are shown below.
>
> The reason it is not possible is that the entirety of test()'s RCU
> read-side critical section must do one of two things:
>
> 1. Come before the return from initialize()'s synchronize_rcu().
> 2. Come after the call to initialize()'s synchronize_rcu().
>
> Suppose test()'s load from "b" sees initialize()'s assignment. That
> assignment follows the return from synchronize_rcu(), so some part of
> test()'s RCU read-side critical section came after that return, which
> rules out option 1. The entirety of test()'s RCU read-side critical
> section must therefore come after initialize()'s call to
> synchronize_rcu(), and the store to "a" precedes that call. Therefore,
> whenever "c" is non-zero, the pr_info() must see "a" non-zero.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> C MP-o-sync-o+rl-o-ctl-o-rul
>
> {}
>
> P0(int *a, int *b)
> {
> 	WRITE_ONCE(*a, 1);
> 	synchronize_rcu();
> 	WRITE_ONCE(*b, 2);
> }
>
> P1(int *a, int *b)
> {
> 	int r0;
> 	int r1;
>
> 	rcu_read_lock();
> 	r0 = READ_ONCE(*b);
> 	if (r0)
> 		r1 = READ_ONCE(*a);
> 	rcu_read_unlock();
> }
>
> exists (1:r0=1 /\ 1:r1=0)
>
> ------------------------------------------------------------------------
>
> States 2
> 1:r0=0; 1:r1=0;
> 1:r0=2; 1:r1=1;
> No
> Witnesses
> Positive: 0 Negative: 2
> Condition exists (1:r0=1 /\ 1:r1=0)
> Observation MP-o-sync-o+rl-o-ctl-o-rul Never 0 2
> Time MP-o-sync-o+rl-o-ctl-o-rul 0.01
> Hash=b20eca2da50fa84b15e489502420ff56
>
> ------------------------------------------------------------------------
>
> The "Never 0 2" means that the condition cannot happen.
Thanks a lot for your detailed explanation! That helps me a lot!
Best Regards,
Huang, Ying