Message-ID: <20210723203248.GL4397@paulmck-ThinkPad-P17-Gen-1>
Date: Fri, 23 Jul 2021 13:32:48 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Manfred Spraul <manfred@...orfullife.com>,
linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
kernel-team@...com, mingo@...nel.org, parri.andrea@...il.com,
will@...nel.org, peterz@...radead.org, boqun.feng@...il.com,
npiggin@...il.com, dhowells@...hat.com, j.alglave@....ac.uk,
luc.maranget@...ia.fr, akiyks@...il.com
Subject: Re: [PATCH memory-model 2/4] tools/memory-model: Add example for
heuristic lockless reads
On Fri, Jul 23, 2021 at 01:08:20PM -0400, Alan Stern wrote:
> On Fri, Jul 23, 2021 at 09:30:08AM -0700, Paul E. McKenney wrote:
> > How about like this?
> >
> > Thanx, Paul
>
> Generally a lot better, but still at least one issue.
>
> > ------------------------------------------------------------------------
> >
> > Lock-Protected Writes With Heuristic Lockless Reads
> > ---------------------------------------------------
> >
> > For another example, suppose that the code can normally make use of
> > a per-data-structure lock, but there are times when a global lock
> > is required. These times are indicated via a global flag. The code
> > might look as follows, and is based loosely on nf_conntrack_lock(),
> > nf_conntrack_all_lock(), and nf_conntrack_all_unlock():
> >
> > bool global_flag;
> > DEFINE_SPINLOCK(global_lock);
> > struct foo {
> > spinlock_t f_lock;
> > int f_data;
> > };
> >
> > /* All foo structures are in the following array. */
> > int nfoo;
> > struct foo *foo_array;
> >
> > void do_something_locked(struct foo *fp)
> > {
> > /* IMPORTANT: Heuristic plus spin_lock()! */
> > if (!data_race(global_flag)) {
> > spin_lock(&fp->f_lock);
> > if (!smp_load_acquire(&global_flag)) {
> > do_something(fp);
> > spin_unlock(&fp->f_lock);
> > return;
> > }
> > spin_unlock(&fp->f_lock);
> > }
> > spin_lock(&global_lock);
> > /* global_lock held, thus global flag cannot be set. */
> > spin_lock(&fp->f_lock);
> > spin_unlock(&global_lock);
> > /*
> > * global_flag might be set here, but begin_global()
> > * will wait for ->f_lock to be released.
> > */
> > do_something(fp);
> > spin_lock(&fp->f_lock);
>
> spin_unlock.

Good eyes, fixed.

> > }
> >
> > void begin_global(void)
> > {
> > int i;
> >
> > spin_lock(&global_lock);
> > WRITE_ONCE(global_flag, true);
> > for (i = 0; i < nfoo; i++) {
> > /*
> > * Wait for pre-existing local locks. One at
> > * a time to avoid lockdep limitations.
> > */
> > spin_lock(&fp->f_lock);
> > spin_unlock(&fp->f_lock);
> > }
> > }
> >
> > void end_global(void)
> > {
> > smp_store_release(&global_flag, false);
> > spin_unlock(&global_lock);
> > }
> >
> > All code paths leading from the do_something_locked() function's first
> > read from global_flag acquire a lock, so endless load fusing cannot
> > happen.
> >
> > If the value read from global_flag is true, then global_flag is
> > rechecked while holding ->f_lock, which, if global_flag is now false,
> > prevents begin_global() from completing. It is therefore safe to invoke
> > do_something().
> >
> > Otherwise, if either value read from global_flag is true, then after
> > global_lock is acquired global_flag must be false. The acquisition of
> > ->f_lock will prevent any call to begin_global() from returning, which
> > means that it is safe to release global_lock and invoke do_something().
> >
> > For this to work, only those foo structures in foo_array[] may be passed
> > to do_something_locked(). The reason for this is that the synchronization
> > with begin_global() relies on momentarily holding the lock of each and
> > every foo structure.
>
> This doesn't mention the reason for the acquire-release
> synchronization of global_flag. It's needed because work done between
> begin_global() and end_global() can affect a foo structure without
> holding its private f_lock member, and we want all such work to be
> visible to other threads when they call do_something_locked() later.
Like this added paragraph at the end?

	The smp_load_acquire() and smp_store_release() are required
	because changes to a foo structure between calls to begin_global()
	and end_global() are carried out without holding that structure's
	->f_lock.  The smp_load_acquire() and smp_store_release()
	ensure that the next invocation of do_something() from the call
	to do_something_locked() that acquires that ->f_lock will see
	those changes.

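For what it is worth, the pattern can be exercised with a rough userspace
model. This is only a sketch and not kernel code. It assumes pthread
mutexes in place of spinlocks and C11 atomics in place of data_race(),
WRITE_ONCE(), smp_load_acquire(), and smp_store_release(). The call to
do_something() is modeled as an increment of ->f_data, and NFOO, NWORKERS,
ITERS, GLOBAL_ROUNDS, worker(), global_updater(), and run_demo() are names
made up for illustration. It also indexes foo_array[i] in begin_global()
where the draft quoted above writes fp.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NFOO 4			/* hypothetical sizes for the demo */
#define NWORKERS 4
#define ITERS 20000
#define GLOBAL_ROUNDS 200

struct foo {
	pthread_mutex_t f_lock;
	long f_data;
};

static atomic_bool global_flag;
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static struct foo foo_array[NFOO];

static void do_something_locked(struct foo *fp)
{
	/* Heuristic check: a relaxed load stands in for data_race(). */
	if (!atomic_load_explicit(&global_flag, memory_order_relaxed)) {
		pthread_mutex_lock(&fp->f_lock);
		/* Acquire pairs with the release store in end_global(). */
		if (!atomic_load_explicit(&global_flag, memory_order_acquire)) {
			fp->f_data++;	/* stand-in for do_something(fp) */
			pthread_mutex_unlock(&fp->f_lock);
			return;
		}
		pthread_mutex_unlock(&fp->f_lock);
	}
	pthread_mutex_lock(&global_lock);
	pthread_mutex_lock(&fp->f_lock);
	pthread_mutex_unlock(&global_lock);
	fp->f_data++;		/* begin_global() must wait for ->f_lock */
	pthread_mutex_unlock(&fp->f_lock);
}

static void begin_global(void)
{
	pthread_mutex_lock(&global_lock);
	atomic_store_explicit(&global_flag, true, memory_order_relaxed);
	for (int i = 0; i < NFOO; i++) {
		/* Wait for pre-existing holders of each local lock. */
		pthread_mutex_lock(&foo_array[i].f_lock);
		pthread_mutex_unlock(&foo_array[i].f_lock);
	}
}

static void end_global(void)
{
	atomic_store_explicit(&global_flag, false, memory_order_release);
	pthread_mutex_unlock(&global_lock);
}

static void *worker(void *arg)
{
	intptr_t id = (intptr_t)arg;

	for (int i = 0; i < ITERS; i++)
		do_something_locked(&foo_array[(id + i) % NFOO]);
	return NULL;
}

static void *global_updater(void *arg)
{
	(void)arg;
	for (int r = 0; r < GLOBAL_ROUNDS; r++) {
		begin_global();
		for (int i = 0; i < NFOO; i++)
			foo_array[i].f_data++;	/* no ->f_lock held here */
		end_global();
	}
	return NULL;
}

/* Runs workers against one global updater; returns the sum of f_data. */
long run_demo(void)
{
	pthread_t w[NWORKERS], g;
	long sum = 0;
	int rc;

	for (int i = 0; i < NFOO; i++) {
		pthread_mutex_init(&foo_array[i].f_lock, NULL);
		foo_array[i].f_data = 0;
	}
	for (intptr_t i = 0; i < NWORKERS; i++) {
		rc = pthread_create(&w[i], NULL, worker, (void *)i);
		assert(rc == 0);
	}
	rc = pthread_create(&g, NULL, global_updater, NULL);
	assert(rc == 0);
	(void)rc;
	for (int i = 0; i < NWORKERS; i++)
		pthread_join(w[i], NULL);
	pthread_join(g, NULL);
	for (int i = 0; i < NFOO; i++)
		sum += foo_array[i].f_data;
	return sum;
}
```

If no increment is lost, run_demo() should return
NWORKERS * ITERS + GLOBAL_ROUNDS * NFOO. A passing run is of course only
a smoke test and not a proof, and older toolchains may need -pthread when
building.
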
Thanx, Paul