Message-ID: <CAJDTihwMgUqjK6YL2M_WZY4VBaifVYgNvdZOF_fNrnk_p27Kvw@mail.gmail.com>
Date: Sun, 11 Mar 2018 15:55:41 +0800
From: 焦晓冬 <milestonejxd@...il.com>
To: linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, stern@...land.harvard.edu,
will.deacon@....com, torvalds@...ux-foundation.org,
npiggin@...il.com, mingo@...nel.org, mpe@...erman.id.au,
oleg@...hat.com, benh@...nel.crashing.org,
paulmck@...ux.vnet.ibm.com
Subject: smp_mb__after_spinlock requirement too strong?
Peter pointed out in this patch https://patchwork.kernel.org/patch/9771921/
that the spinlock used at __schedule() should be RCsc, to ensure that writes
made before __schedule() are visible when the task is migrated to another CPU.

This is emphasized in the comment of the newly introduced
smp_mb__after_spinlock():
* This barrier must provide two things:
*
*   - it must guarantee a STORE before the spin_lock() is ordered against a
*     LOAD after it, see the comments at its two usage sites.
*
*   - it must ensure the critical section is RCsc.
*
* The latter is important for cases where we observe values written by other
* CPUs in spin-loops, without barriers, while being subject to scheduling.
*
* CPU0                  CPU1                    CPU2
*
*                       for (;;) {
*                         if (READ_ONCE(X))
*                           break;
*                       }
* X=1
*                       <sched-out>
*                                               <sched-in>
*                                               r = X;
*
* without transitivity it could be that CPU1 observes X!=0 breaks the loop,
* we get migrated and CPU2 sees X==0.
which is used at:

	__schedule(bool preempt) {
	    ...
	    rq_lock(rq, &rf);
	    smp_mb__after_spinlock();
	    ...
	}
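To restate that first guarantee in code: as far as I understand, it protects the
classic store-buffering pattern between a sleeper and a waker, each of which does a
STORE, takes its own lock, and then does a LOAD. Below is a minimal user-space sketch
of that pattern (entirely my own illustration with made-up names, not kernel code);
an explicit seq_cst fence stands in for smp_mb__after_spinlock(), and lock()/unlock()
are modelled as acquire/release only:

	#include <assert.h>
	#include <pthread.h>
	#include <stdatomic.h>

	static atomic_int lock_a, lock_b;   /* two unrelated locks, think rq->lock and p->pi_lock */
	static atomic_int flag0, flag1;     /* think p->state and the wakeup condition */
	static int r0, r1;

	static void lock(atomic_int *l)     /* spin_lock(): acquire only */
	{
		while (atomic_exchange_explicit(l, 1, memory_order_acquire))
			;
	}

	static void unlock(atomic_int *l)   /* spin_unlock(): release only */
	{
		atomic_store_explicit(l, 0, memory_order_release);
	}

	static void *sleeper(void *arg)
	{
		atomic_store_explicit(&flag0, 1, memory_order_relaxed);  /* STORE before the lock */
		lock(&lock_a);
		atomic_thread_fence(memory_order_seq_cst);               /* smp_mb__after_spinlock() */
		r0 = atomic_load_explicit(&flag1, memory_order_relaxed); /* LOAD after the lock */
		unlock(&lock_a);
		return NULL;
	}

	static void *waker(void *arg)
	{
		atomic_store_explicit(&flag1, 1, memory_order_relaxed);
		lock(&lock_b);
		atomic_thread_fence(memory_order_seq_cst);
		r1 = atomic_load_explicit(&flag0, memory_order_relaxed);
		unlock(&lock_b);
		return NULL;
	}

	int main(void)
	{
		pthread_t t0, t1;
		pthread_create(&t0, NULL, sleeper, NULL);
		pthread_create(&t1, NULL, waker, NULL);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);
		/* with both fences, "each side misses the other's store" is forbidden */
		assert(!(r0 == 0 && r1 == 0));
		return 0;
	}

Without the two fences, the acquire-only locks would allow r0 == 0 && r1 == 0; that
is the part of the guarantee I am not questioning here.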
Unless I have missed something, this kind of visibility does __not__
necessarily depend on the spinlock at __schedule() being RCsc.

In fact, for a runnable task A, the migration would be:
CPU0                            CPU1                            CPU2

<ACCESS before schedule out A>

lock(rq0)
schedule out A
unlock(rq0)

                                lock(rq0)
                                remove A from rq0
                                unlock(rq0)

                                lock(rq2)
                                add A into rq2
                                unlock(rq2)

                                                                lock(rq2)
                                                                schedule in A
                                                                unlock(rq2)

                                                                <ACCESS after schedule in A>
<ACCESS before schedule out A> happens-before
unlock(rq0) happends-before
lock(rq0) happends-before
unlock(rq2) happens-before
lock(rq2) happens-before
<ACCESS after schedule in A>
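To convince myself of the above, here is a minimal user-space sketch of that
unlock/lock chain (again entirely my own construction, not kernel code): lock()/unlock()
provide only acquire/release ordering, i.e. RCpc with no full barrier anywhere, and the
two flags merely model the protocol order in which the critical sections must run. The
final assertion cannot fail, because each unlock -> lock hand-off carries the earlier
writes along:

	#include <assert.h>
	#include <pthread.h>
	#include <stdatomic.h>

	static atomic_int rq0_lock, rq2_lock;  /* stand-ins for rq0->lock and rq2->lock */
	static atomic_int a_off_rq0, a_on_rq2; /* protocol state, only touched under the locks */
	static int payload;                    /* plain data, no barriers of its own */

	static void lock(atomic_int *l)        /* spin_lock(): acquire only */
	{
		while (atomic_exchange_explicit(l, 1, memory_order_acquire))
			;
	}

	static void unlock(atomic_int *l)      /* spin_unlock(): release only */
	{
		atomic_store_explicit(l, 0, memory_order_release);
	}

	static void *cpu0(void *arg)           /* schedules A out */
	{
		payload = 42;                  /* <ACCESS before schedule out A> */
		lock(&rq0_lock);
		atomic_store_explicit(&a_off_rq0, 1, memory_order_relaxed); /* schedule out A */
		unlock(&rq0_lock);
		return NULL;
	}

	static void *cpu1(void *arg)           /* migrates A from rq0 to rq2 */
	{
		for (;;) {
			lock(&rq0_lock);
			if (atomic_load_explicit(&a_off_rq0, memory_order_relaxed))
				break;         /* A has been scheduled out */
			unlock(&rq0_lock);     /* not yet, retry */
		}
		/* remove A from rq0 */
		unlock(&rq0_lock);

		lock(&rq2_lock);
		atomic_store_explicit(&a_on_rq2, 1, memory_order_relaxed); /* add A into rq2 */
		unlock(&rq2_lock);
		return NULL;
	}

	static void *cpu2(void *arg)           /* schedules A in */
	{
		for (;;) {
			lock(&rq2_lock);
			if (atomic_load_explicit(&a_on_rq2, memory_order_relaxed))
				break;         /* A is on rq2 */
			unlock(&rq2_lock);     /* not yet, retry */
		}
		/* schedule in A */
		unlock(&rq2_lock);

		assert(payload == 42);         /* <ACCESS after schedule in A> */
		return NULL;
	}

	int main(void)
	{
		pthread_t t[3];
		pthread_create(&t[0], NULL, cpu0, NULL);
		pthread_create(&t[1], NULL, cpu1, NULL);
		pthread_create(&t[2], NULL, cpu2, NULL);
		for (int i = 0; i < 3; i++)
			pthread_join(t[i], NULL);
		return 0;
	}

So the visibility of <ACCESS before schedule out A> at CPU2 seems to follow from the
release/acquire chain through the two runqueue locks alone, without any full barrier.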
And for a stopped task A, it would be:
CPU0                            CPU1                            CPU2

<ACCESS before schedule out A>

lock(rq0)
schedule out A
remove A from rq0
store-release(A->on_cpu)
unlock(rq0)

                                load_acquire(A->on_cpu)
                                set_task_cpu(A, 2)

                                lock(rq2)
                                add A into rq2
                                unlock(rq2)

                                                                lock(rq2)
                                                                schedule in A
                                                                unlock(rq2)

                                                                <ACCESS after schedule in A>
<ACCESS before schedule out A> happens-before
store-release(A->on_cpu) happens-before
load_acquire(A->on_cpu) happens-before
unlock(rq2) happens-before
lock(rq2) happens-before
<ACCESS after schedule in A>
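For reference, if I am reading kernel/sched/core.c correctly, the store-release /
load-acquire pair above corresponds roughly to the following (heavily simplified,
details elided):

	/* CPU0, after scheduling A out (finish_task_switch() path) */
	smp_store_release(&A->on_cpu, 0);

	/* the waking CPU, in try_to_wake_up() */
	smp_cond_load_acquire(&A->on_cpu, !VAL);
	...
	set_task_cpu(A, 2);
	...                     /* take rq2->lock and add A into rq2 */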
So, I think the only requirement on smp_mb__after_spinlock() is to
guarantee that a STORE before the spin_lock() is ordered against a LOAD
after it, and we could drop the RCsc requirement to allow a more
efficient implementation.
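For comparison, if I am reading the current tree correctly, the present
definitions are roughly:

	/* include/linux/spinlock.h, generic fallback */
	#ifndef smp_mb__after_spinlock
	#define smp_mb__after_spinlock()	do { } while (0)
	#endif

	/* arch/powerpc/include/asm/barrier.h */
	#define smp_mb__after_spinlock()	smp_mb()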
Did I miss something, or does this RCsc requirement not really matter?