[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140201232245.GB4333@linux.vnet.ibm.com>
Date: Sat, 1 Feb 2014 15:22:45 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: George Spelvin <linux@...izon.com>, Waiman.Long@...com,
akpm@...ux-foundation.org, andi@...stfloor.org, arnd@...db.de,
aswin@...com, hpa@...or.com, linux-arch@...r.kernel.org,
linux-kernel@...r.kernel.org, mingo@...hat.com,
raghavendra.kt@...ux.vnet.ibm.com, riel@...hat.com,
rostedt@...dmis.org, scott.norton@...com, tglx@...utronix.de,
tim.c.chen@...ux.intel.com, torvalds@...ux-foundation.org,
walken@...gle.com, x86@...nel.org
Subject: Re: [PATCH v11 0/4] Introducing a queue read/write lock
implementation
On Fri, Jan 31, 2014 at 11:17:29AM +0100, Peter Zijlstra wrote:
> On Fri, Jan 31, 2014 at 05:03:48AM -0500, George Spelvin wrote:
> > How about getting rid of that TICKET_MSB mess and doing something like:
> >
> > #define TICKET_MASK 0xFFFF
> >
> > static inline void ticket_spin_unlock(atomic_t *tickets)
> > {
> > u32 t = *tickets;
> >
> > smp_mb__before_atomic_inc();
> >
> > /* Increment the low 16 bits without affecting the upper. */
> > if (unlikely((~t & TICKET_MASK) == 0))
> > atomic_add(-(atomic_t)TICKET_MASK, tickets);
> > else
> > atomic_inc(tickets);
> > }
> >
> > That also allows up to 2^16 waiters, up from 2^15.
> > (Minus one on both cases, if you want to be fussy.)
>
> Ah indeed. That'll work. That said, any arch that can single-copy
> address shorts can probably do better than this generic atomic_t thing.
>
> My main point was that we should seriously look at a ticket lock instead
> of the MCS one, because while the MCS has better contention behaviour,
> we shouldn't optimize locks for the worst contention.
We do need to articulate what we -are- optimizing for, or more generally,
what we are trying to accomplish with these locking primitives. Here
are some of the traditional goals:
1. Avoid performance collapse under heavy load. Here the goal is
to have throughput level off rather than decreasing when load
exceeds saturation levels.
2. Avoid starvation of requesters. Here the goal is a very
minimal level of fairness.
3. Achieve some reasonable level of fairness if the underlying
hardware provides fair access to memory. Here "reasonable"
might mean that lock-grant probabilities not differ by more
than (say) a factor of two.
4. Achieve some reasonable level of fairness even if the underlying
hardware is unfair. Rumor has it that some of the very large
systems in use today do not guarantee fair access to cache lines
under heavy memory contention.
5. Achieve some minimal level of scalability, so that throughput
increases monotonically with increasing load. Preferably
without penalizing small-system performance.
6. Achieve some reasonable level of scalability, so that adding
a given CPU provides at least some fraction (say, 80%) of
one CPU's worth of incremental throughput.
7. Achieve linear scalability.
Ticket locking has problems with #1 on some systems due to the thundering
herd of reads following each invalidation. (Yes, it is not as bad
as the atomic-operation thundering herd associated with test-and-set
spinlocks, but it can neverthelless be a problem on larger systems.)
Achieving #4 requires some sort of queued lock as far as I can see.
Although you -might- get #5 with a very clever NUMA-aware lock, #6 and
#7 will require some level of redesign.
Fancy locks can help if the common code path is fully parallelized,
but where some rare error path is not worth the effort. In this case,
using a fancy lock on the error path can at least keep the system moving
during the time that the error condition is in force, and hopefully
permit returning to full throughput once the error condition is gone.
So, what exactly are we trying to achieve with all this? ;-)
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists