lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <19A03A16-9F38-481F-824D-4C883AB8B105@joelfernandes.org>
Date:   Thu, 22 Dec 2022 08:03:39 -0500
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     linux-kernel@...r.kernel.org,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Josh Triplett <josh@...htriplett.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        "Paul E. McKenney" <paulmck@...nel.org>, rcu@...r.kernel.org,
        Steven Rostedt <rostedt@...dmis.org>, neeraj.iitr10@...il.com
Subject: Re: [PATCH] srcu: Remove memory barrier "E" as it does not do anything



> On Dec 22, 2022, at 7:03 AM, Frederic Weisbecker <frederic@...nel.org> wrote:
> 
> On Thu, Dec 22, 2022 at 12:20:11AM +0000, Joel Fernandes (Google) wrote:
>> During a flip, we have a full memory barrier before srcu_idx is incremented.
>> 
>> The idea is we intend to order the first phase scan's read of lock
>> counters with the flipping of the index.
>> 
>> However, such ordering is already enforced because of the
>> control-dependency between the 2 scans. We would be flipping the index
>> only if lock and unlock counts matched.
>> 
>> But such match will not happen if there was a pending reader before the flip
>> in the first place (observation courtesy Mathieu Desnoyers).
>> 
>> The litmus test below shows this:
>> (test courtesy Frederic Weisbecker, Changes for ctrldep by Boqun/me):
>> 
>> C srcu
>> 
>> {}
>> 
>> // updater
>> P0(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1)
>> {
>>        int lock1;
>>        int unlock1;
>>        int lock0;
>>        int unlock0;
>> 
>>        // SCAN1
>>        unlock1 = READ_ONCE(*UNLOCK1);
>>        smp_mb(); // A
>>        lock1 = READ_ONCE(*LOCK1);
>> 
>>        // FLIP
>>        if (lock1 == unlock1) {         // Control dep
>>                smp_mb(); // E
>>                WRITE_ONCE(*IDX, 1);
>>                smp_mb(); // D
>> 
>>                // SCAN2
>>                unlock0 = READ_ONCE(*UNLOCK0);
>>                smp_mb(); // A
>>                lock0 = READ_ONCE(*LOCK0);
>>        }
>> }
>> 
>> // reader
>> P1(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1)
>> {
>>        int tmp;
>>        int idx1;
>>        int idx2;
>> 
>>        // 1st reader
>>        idx1 = READ_ONCE(*IDX);
>>        if (idx1 == 0) {
>>                tmp = READ_ONCE(*LOCK0);
>>                WRITE_ONCE(*LOCK0, tmp + 1);
>>                smp_mb(); /* B and C */
>>                tmp = READ_ONCE(*UNLOCK0);
>>                WRITE_ONCE(*UNLOCK0, tmp + 1);
>>        } else {
>>                tmp = READ_ONCE(*LOCK1);
>>                WRITE_ONCE(*LOCK1, tmp + 1);
>>                smp_mb(); /* B and C */
>>                tmp = READ_ONCE(*UNLOCK1);
>>                WRITE_ONCE(*UNLOCK1, tmp + 1);
>>        }
>> 
>>        // second reader
>>        idx2 = READ_ONCE(*IDX);
>>        if (idx2 == 0) {
>>                tmp = READ_ONCE(*LOCK0);
>>                WRITE_ONCE(*LOCK0, tmp + 1);
>>                smp_mb(); /* B and C */
>>                tmp = READ_ONCE(*UNLOCK0);
>>                WRITE_ONCE(*UNLOCK0, tmp + 1);
>>        } else {
>>                tmp = READ_ONCE(*LOCK1);
>>                WRITE_ONCE(*LOCK1, tmp + 1);
>>                smp_mb(); /* B and C */
>>                tmp = READ_ONCE(*UNLOCK1);
>>                WRITE_ONCE(*UNLOCK1, tmp + 1);
>>        }
>> }
>> 
>> The following bad condition will not occur even if memory barrier E
>> is dropped:
>> 
>> (* bad condition: SCAN1 saw lock count changes though 1st reader saw flip *)
>> exists (0:lock1=1 /\ 1:idx1=1 /\ 1:idx2=1)
> 
> Good catch!
> 
> The litmus test can be shorten with removing the second reader and limit the
> exists clause to 0:lock1=1 :

Yes. Made the change and that works too.

> 
> ---
> C srcu-E
> 
> {}
> 
> // updater
> P0(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1)
> {
>    int lock1;
>    int unlock1;
>    int lock0;
>    int unlock0;
> 
>    // SCAN1
>    unlock1 = READ_ONCE(*UNLOCK1);
>    smp_mb(); // A
>    lock1 = READ_ONCE(*LOCK1);
> 
>    // FLIP
>    if (lock1 == unlock1) {         // Control dep
>        WRITE_ONCE(*IDX, 1);
>        smp_mb(); // D
> 
>        // SCAN2
>        unlock0 = READ_ONCE(*UNLOCK0);
>        smp_mb(); // A
>        lock0 = READ_ONCE(*LOCK0);
>    }
> }
> 
> // reader
> P1(int *IDX, int *LOCK0, int *UNLOCK0, int *LOCK1, int *UNLOCK1)
> {
>    int tmp;
>    int idx1;
>    int idx2;
> 
>    // 1st reader
>    idx1 = READ_ONCE(*IDX);
>    if (idx1 == 0) {
>        tmp = READ_ONCE(*LOCK0);
>        WRITE_ONCE(*LOCK0, tmp + 1);
>        smp_mb(); /* B and C */
>        tmp = READ_ONCE(*UNLOCK0);
>        WRITE_ONCE(*UNLOCK0, tmp + 1);
>    } else {
>        tmp = READ_ONCE(*LOCK1);
>        WRITE_ONCE(*LOCK1, tmp + 1);
>        smp_mb(); /* B and C */
>        tmp = READ_ONCE(*UNLOCK1);
>        WRITE_ONCE(*UNLOCK1, tmp + 1);
>    }
> }
> exists (0:lock1=1) (* only happens if the control dep is removed *)
> --
> 
>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>> index 1c304fec89c0..d1368d64fdba 100644
>> --- a/kernel/rcu/srcutree.c
>> +++ b/kernel/rcu/srcutree.c
>> @@ -983,15 +983,9 @@ static bool try_check_zero(struct srcu_struct *ssp, int idx, int trycount)
>> static void srcu_flip(struct srcu_struct *ssp)
>> {
>>    /*
>> -     * Ensure that if this updater saw a given reader's increment
>> -     * from __srcu_read_lock(), that reader was using an old value
>> -     * of ->srcu_idx.  Also ensure that if a given reader sees the
>> -     * new value of ->srcu_idx, this updater's earlier scans cannot
>> -     * have seen that reader's increments (which is OK, because this
>> -     * grace period need not wait on that reader).
>> +     * Control dependency locks==unlocks ensures that if a reader sees the
>> +     * flip, previous scan would only see before-flip reader lock counts.
> 
> I would personally prefer that we keep the detailed comment explaining the
> ordering requirements here, just mentioning it's performed via control
> dependency now. This way we don't need to debate that again in 10 years.

Cool, will update the patch.

Thanks.



> 
> 
>>     */
>> -    smp_mb(); /* E */  /* Pairs with B and C. */
>> -
>>    WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
>> 
>>    /*
>> -- 
>> 2.39.0.314.g84b9a713c41-goog
>> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ