[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bb96e032-4f7d-41bf-a675-81350dca8d0a@paulmck-laptop>
Date: Mon, 11 Nov 2024 17:28:44 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Neeraj Upadhyay <Neeraj.Upadhyay@....com>
Cc: frederic@...nel.org, rcu@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel-team@...a.com, rostedt@...dmis.org,
kernel test robot <oliver.sang@...el.com>,
Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Kent Overstreet <kent.overstreet@...ux.dev>, bpf@...r.kernel.org
Subject: Re: [PATCH rcu 06/12] srcu: Add srcu_read_lock_lite() and
srcu_read_unlock_lite()
On Mon, Nov 11, 2024 at 10:21:58PM +0530, Neeraj Upadhyay wrote:
> On 11/11/2024 8:56 PM, Paul E. McKenney wrote:
> > On Mon, Nov 11, 2024 at 06:24:58PM +0530, Neeraj Upadhyay wrote:
> >>> /*
> >>> - * Returns approximate total of the readers' ->srcu_lock_count[] values
> >>> - * for the rank of per-CPU counters specified by idx.
> >>> + * Computes approximate total of the readers' ->srcu_lock_count[] values
> >>> + * for the rank of per-CPU counters specified by idx, and returns true if
> >>> + * the caller did the proper barrier (gp), and if the count of the locks
> >>> + * matches that of the unlocks passed in.
> >>> */
> >>> -static unsigned long srcu_readers_lock_idx(struct srcu_struct *ssp, int idx)
> >>> +static bool srcu_readers_lock_idx(struct srcu_struct *ssp, int idx, bool gp, unsigned long unlocks)
> >>> {
> >>> int cpu;
> >>> + unsigned long mask = 0;
> >>> unsigned long sum = 0;
> >>>
> >>> for_each_possible_cpu(cpu) {
> >>> struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu);
> >>>
> >>> sum += atomic_long_read(&sdp->srcu_lock_count[idx]);
> >>> + if (IS_ENABLED(CONFIG_PROVE_RCU))
> >>> + mask = mask | READ_ONCE(sdp->srcu_reader_flavor);
> >>> }
> >>> - return sum;
> >>> + WARN_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && (mask & (mask - 1)),
> >>> + "Mixed reader flavors for srcu_struct at %ps.\n", ssp);
> >>
> >> I am trying to understand the (unlikely) case where synchronize_srcu() is done before any
> >> srcu reader lock/unlock lite call is done. Can new SRCU readers fail to observe the
> >> updates?
> >
> > If a SRCU reader fail to observe the index flip, then isn't it the case
> > that the synchronize_rcu() invoked from srcu_readers_active_idx_check()
> > must wait on it?
>
> Below is the sequence of operations I was thinking of, where at step 4 CPU2
> reads old pointer
>
> ptr = old
>
>
> CPU1 CPU2
>
> 1. Update ptr = new
>
> 2. synchronize_srcu()
>
> <Does not use synchronize_rcu()
> as SRCU_READ_FLAVOR_LITE is not
> set for any sdp as srcu_read_lock_lite()
> hasn't been called by any CPU>
>
> 3. srcu_read_lock_lite()
> <No smp_mb() ordering>
>
> 4. Can read ptr == old ?
As long as the kernel was built with CONFIG_PROVE_RCU=y and given a fix
to the wrong-CPU issue you quite rightly point out below, no it cannot.
The CPU's first call to srcu_read_lock_lite() will use cmpxchg() to update
->srcu_reader_flavor, which will place full ordering between that update
and the later read from "ptr".
So if the synchronize_srcu() is too early to see the SRCU_READ_FLAVOR_LITE
bit, then the reader must see the new value of "ptr". Similarly,
if the reader can see the old value of "ptr", then synchronize_srcu()
must see the reader's setting of the SRCU_READ_FLAVOR_LITE bit.
But both the CONFIG_PROVE_RCU=n and the wrong-CPU issue must be fixed
for this to work. Please see the upcoming patches to be posted as a
reply to this email.
> >>> + if (mask & SRCU_READ_FLAVOR_LITE && !gp)
> >>> + return false;
> >>
> >> So, srcu_readers_active_idx_check() can potentially return false for very long
> >> time, until the CPU executing srcu_readers_active_idx_check() does
> >> at least one read lock/unlock lite call?
> >
> > That is correct. The theory is that until after an srcu_read_lock_lite()
> > has executed, there is no need to wait on it. Does the practice match the
> > theory in this case, or is there some sequence of events that I missed?
>
> Below sequence
>
> CPU1 CPU2
> 1. srcu_read_lock_lite()
>
>
> 2. srcu_read_unlock_lite()
>
> 3. synchronize_srcu()
>
> 3.1 srcu_readers_lock_idx() is
> called with gp = false as
> srcu_read_lock_lite() was never
> called on this CPU for this
> srcu_struct. So
> ssp->sda->srcu_reader_flavor is not
> set for CPU1's sda.
Good eyes! Yes, the scan that sums the ->srcu_unlock_count[] counters
must also OR together the ->srcu_reader_flavor fields.
> 3.2 Inside srcu_readers_lock_idx()
> "mask" contains SRCU_READ_FLAVOR_LITE
> as CPU2's sdp->srcu_reader_flavor has it.
>
> 3.3 CPU1 keeps returning false from
> below check until CPU1 does at least
> one srcu_read_lock_lite() call or
> the thread migrates.
>
> if (mask & SRCU_READ_FLAVOR_LITE && !gp)
> return false;
This is also fixed by the OR of the ->srcu_reader_flavor fields, correct?
I guess I could claim that this bug prevents the wrong-CPU bug above
from resulting in a too-short SRCU grace period, but it is of course
better to just fix the bugs. ;-)
> >>> + return sum == unlocks;
> >>> }
> >>>
> >>> /*
> >>> @@ -473,6 +482,7 @@ static unsigned long srcu_readers_unlock_idx(struct srcu_struct *ssp, int idx)
> >>> */
> >>> static bool srcu_readers_active_idx_check(struct srcu_struct *ssp, int idx)
> >>> {
> >>> + bool did_gp = !!(raw_cpu_read(ssp->sda->srcu_reader_flavor) & SRCU_READ_FLAVOR_LITE);
> >>
> >> sda->srcu_reader_flavor is only set when CONFIG_PROVE_RCU is enabled. But we
> >> need the reader flavor information for srcu lite variant to work. So, lite
> >> variant does not work when CONFIG_PROVE_RCU is disabled. Am I missing something
> >> obvious here?
> >
> > At first glance, it appears that I am the one who missed something obvious.
> > Including in testing, which failed to uncover this issue.
> >
> > Thank you for the careful reviews!
>
> Sure thing, no problem!
And again, thank you!
Thanx, Paul
> >>> unsigned long unlocks;
> >>>
> >>> unlocks = srcu_readers_unlock_idx(ssp, idx);
> >>> @@ -482,13 +492,16 @@ static bool srcu_readers_active_idx_check(struct srcu_struct *ssp, int idx)
> >>> * unlock is counted. Needs to be a smp_mb() as the read side may
> >>> * contain a read from a variable that is written to before the
> >>> * synchronize_srcu() in the write side. In this case smp_mb()s
> >>> - * A and B act like the store buffering pattern.
> >>> + * A and B (or X and Y) act like the store buffering pattern.
> >>> *
> >>> - * This smp_mb() also pairs with smp_mb() C to prevent accesses
> >>> - * after the synchronize_srcu() from being executed before the
> >>> - * grace period ends.
> >>> + * This smp_mb() also pairs with smp_mb() C (or, in the case of X,
> >>> + * Z) to prevent accesses after the synchronize_srcu() from being
> >>> + * executed before the grace period ends.
> >>> */
> >>> - smp_mb(); /* A */
> >>> + if (!did_gp)
> >>> + smp_mb(); /* A */
> >>> + else
> >>> + synchronize_rcu(); /* X */
> >>>
> >>> /*
> >>> * If the locks are the same as the unlocks, then there must have
> >>> @@ -546,7 +559,7 @@ static bool srcu_readers_active_idx_check(struct srcu_struct *ssp, int idx)
> >>> * which are unlikely to be configured with an address space fully
> >>> * populated with memory, at least not anytime soon.
> >>> */
> >>> - return srcu_readers_lock_idx(ssp, idx) == unlocks;
> >>> + return srcu_readers_lock_idx(ssp, idx, did_gp, unlocks);
> >>> }
> >>>
> >>
Powered by blists - more mailing lists