Date:   Sun, 11 Mar 2018 15:55:41 +0800
From:   焦晓冬 <milestonejxd@...il.com>
To:     linux-kernel@...r.kernel.org
Cc:     peterz@...radead.org, stern@...land.harvard.edu,
        will.deacon@....com, torvalds@...ux-foundation.org,
        npiggin@...il.com, mingo@...nel.org, mpe@...erman.id.au,
        oleg@...hat.com, benh@...nel.crashing.org,
        paulmck@...ux.vnet.ibm.com
Subject: smp_mb__after_spinlock requirement too strong?

Peter pointed out in this patch, https://patchwork.kernel.org/patch/9771921/,
that the spinlock acquired in __schedule() should be RCsc, to ensure that
writes made before __schedule() are visible once the task has been migrated
to another CPU.

This is emphasized in the comment on the newly introduced
smp_mb__after_spinlock():

 * This barrier must provide two things:
 *
 *   - it must guarantee a STORE before the spin_lock() is ordered against a
 *     LOAD after it, see the comments at its two usage sites.
 *
 *   - it must ensure the critical section is RCsc.
 *
 * The latter is important for cases where we observe values written by other
 * CPUs in spin-loops, without barriers, while being subject to scheduling.
 *
 * CPU0         CPU1            CPU2
 *
 *          for (;;) {
 *            if (READ_ONCE(X))
 *              break;
 *          }
 * X=1
 *          <sched-out>
 *                      <sched-in>
 *                      r = X;
 *
 * without transitivity it could be that CPU1 observes X!=0 breaks the loop,
 * we get migrated and CPU2 sees X==0.

which is used in __schedule():

__schedule(bool preempt) {
    ...
    rq_lock(rq, &rf);
    smp_mb__after_spinlock();
    ...
}
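
(For reference, and if I read the tree correctly, the current definitions
are roughly:

/* include/linux/spinlock.h -- generic fallback */
#ifndef smp_mb__after_spinlock
#define smp_mb__after_spinlock()        do { } while (0)
#endif

/* arch/powerpc/include/asm/barrier.h */
#define smp_mb__after_spinlock()        smp_mb()

so any relaxation of the requirement would mainly matter for the weakly
ordered architectures.)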

Unless I have missed something, this kind of visibility does __not__
necessarily depend on the spinlock in __schedule() being RCsc.

In fact, for a runnable task A, the migration would be:

 CPU0         CPU1            CPU2

<ACCESS before schedule out A>

lock(rq0)
schedule out A
unlock(rq0)

              lock(rq0)
              remove A from rq0
              unlock(rq0)

              lock(rq2)
              add A into rq2
              unlock(rq2)
                                        lock(rq2)
                                        schedule in A
                                        unlock(rq2)

                                        <ACCESS after schedule in A>

<ACCESS before schedule out A> happens-before
unlock(rq0) happens-before
lock(rq0) happens-before
unlock(rq2) happens-before
lock(rq2) happens-before
<ACCESS after schedule in A>
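
To make it explicit that only release/acquire ordering on the rq locks is
relied on here, the first link of that chain could be sketched with plain
C11 atomics (a user-space analogue, not kernel code; rq0_lock stands in
for rq0->lock):

#include <stdatomic.h>

int X;                     /* written by <ACCESS before schedule out A> */
atomic_int rq0_lock = 1;   /* CPU0 is assumed to already hold rq0 */

void cpu0(void)            /* schedules A out */
{
        X = 1;                                       /* <ACCESS before schedule out A> */
        atomic_store_explicit(&rq0_lock, 0,
                              memory_order_release); /* unlock(rq0) is a RELEASE */
}

void cpu1(void)            /* migrates A */
{
        int expected = 0;

        /* lock(rq0) is an ACQUIRE: spin until the lock is free, then take it */
        while (!atomic_compare_exchange_weak_explicit(&rq0_lock, &expected, 1,
                                                      memory_order_acquire,
                                                      memory_order_relaxed))
                expected = 0;

        /* X == 1 is now guaranteed by the release->acquire pair alone */
}

The same release->acquire reasoning repeats for the unlock(rq2)/lock(rq2)
pair, with no RCsc ordering involved.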

And for a stopped task A, it would be:

 CPU0         CPU1            CPU2

<ACCESS before schedule out A>

lock(rq0)
schedule out A
remove A from rq0
store-release(A->on_cpu)
unlock(rq0)

              load_acquire(A->on_cpu)
              set_task_cpu(A, 2)

              lock(rq2)
              add A into rq2
              unlock(rq2)

                                        lock(rq2)
                                        schedule in A
                                        unlock(rq2)

                                        <ACCESS after schedule in A>

<ACCESS before schedule out A> happens-before
store-release(A->on_cpu)  happens-before
load_acquire(A->on_cpu)  happens-before
unlock(rq2) happens-before
lock(rq2) happens-before
<ACCESS after schedule in A>
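
The release/acquire pair on A->on_cpu relied on above is, if I read the
current code correctly, roughly the following (with prev/p being task A):

/* context-switch side, once A has been fully switched out */
smp_store_release(&prev->on_cpu, 0);

/* try_to_wake_up() side, before A is migrated */
smp_cond_load_acquire(&p->on_cpu, !VAL);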

So, I think the only requirement on smp_mb__after_spinlock() is to
guarantee that a STORE before the spin_lock() is ordered against a LOAD
after it. We could then drop the RCsc requirement and allow a more
efficient implementation.
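
That retained requirement is, as far as I can tell, the usual waker/sleeper
store-buffering pattern; roughly (not verbatim kernel code, CONDITION stands
in for whatever flag the waker sets before waking p):

 CPU0 (waker)                           CPU1 (sleeper)

 CONDITION = 1;                         set_current_state(TASK_UNINTERRUPTIBLE);
 raw_spin_lock(&p->pi_lock);            if (!CONDITION)
 smp_mb__after_spinlock();                      schedule();
 r = p->state;

Here smp_mb__after_spinlock() pairs with the full barrier in
set_current_state() and forbids the outcome where CPU0 still sees the old
p->state while CPU1 sees CONDITION == 0, i.e. a lost wakeup.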

Did I miss something, or does the RCsc requirement not really matter here?
