Message-ID: <55A7147B.8020406@hp.com>
Date: Wed, 15 Jul 2015 22:18:35 -0400
From: Waiman Long <waiman.long@...com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, Scott J Norton <scott.norton@...com>,
Douglas Hatch <doug.hatch@...com>,
Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH v2 5/6] locking/pvqspinlock: Opportunistically defer kicking
to unlock time
On 07/15/2015 06:03 AM, Peter Zijlstra wrote:
> On Tue, Jul 14, 2015 at 10:13:36PM -0400, Waiman Long wrote:
>> +static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
>> {
>> struct pv_node *pn = (struct pv_node *)node;
>>
>> + if (xchg(&pn->state, vcpu_running) == vcpu_running)
>> + return;
>> +
>> /*
>> + * Kicking the next node at lock time can actually be a bit faster
>> + * than doing it at unlock time because the critical section time
>> + * overlaps with the wakeup latency of the next node. However, if the
>> + * VM is too overcommitted, it can happen that we need to kick the
>> + * CPU again at unlock time (double-kick). To avoid that and also to
>> + * fully utilize the kick-ahead functionality at unlock time,
>> + * kicking will be deferred if either of the following two
>> + * conditions holds:
>> *
>> + * 1) The VM guest has so few vCPUs that kick-ahead is not even
>> + * enabled. In this case, the chance of a double-kick will be
>> + * higher.
>> + * 2) The node after the next one is also in the halted state.
>> *
>> + * In either case, the hashed flag is set to indicate that the
>> + * hash table has been filled and _Q_SLOW_VAL is set.
>> */
>> - if (xchg(&pn->state, vcpu_running) == vcpu_halted) {
>> - pvstat_inc(pvstat_lock_kick);
>> - pv_kick(pn->cpu);
>> + if ((!pv_kick_ahead || pv_get_kick_node(pn, 1)) &&
>> + (xchg(&pn->hashed, 1) == 0)) {
>> + struct __qspinlock *l = (void *)lock;
>> +
>> + /*
>> + * As this is the same vCPU that will check the _Q_SLOW_VAL
>> + * value and the hash table later on at unlock time, no atomic
>> + * instruction is needed.
>> + */
>> + WRITE_ONCE(l->locked, _Q_SLOW_VAL);
>> + (void)pv_hash(lock, pn);
>> + return;
>> }
>> +
>> + /*
>> + * Kicking the vCPU even if it is not really halted is safe.
>> + */
>> + pvstat_inc(pvstat_lock_kick);
>> + pv_kick(pn->cpu);
>> }
>>
>> /*
>> @@ -513,6 +545,13 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
>> cpu_relax();
>> }
>>
>> + if (!lp && (xchg(&pn->hashed, 1) == 1))
>> + /*
>> + * The hash table & _Q_SLOW_VAL had been filled
>> + * by the lock holder.
>> + */
>> + lp = (struct qspinlock **)-1;
>> +
>> if (!lp) { /* ONCE */
>> lp = pv_hash(lock, pn);
>> /*
> *groan*, so you complained the previous version of this patch was too
> complex, but let me say I vastly preferred it to this one :/
I said it was complex because maintaining a tri-state variable needs more
thought than two bi-state variables. I can revert back to the tri-state
variable, since doing an unconditional kick at unlock time simplifies the
code in pv_wait_head().
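
Roughly, what I have in mind is something like the sketch below (untested,
just to illustrate the idea; vcpu_hashed would be the new third state, and
the exact memory ordering of the cmpxchg still needs checking):

static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
{
	struct pv_node *pn = (struct pv_node *)node;
	struct __qspinlock *l = (void *)lock;

	/*
	 * Only act if the queue node vCPU is halted; a running vCPU will
	 * observe next->locked by itself and needs neither a kick nor a
	 * hash table entry.
	 */
	if (cmpxchg(&pn->state, vcpu_halted, vcpu_hashed) != vcpu_halted)
		return;

	/*
	 * Defer the kick to unlock time: fill the hash table and set
	 * _Q_SLOW_VAL now so that the unlock slowpath always does the
	 * hash lookup and the kick. As before, this is the same vCPU
	 * that will check _Q_SLOW_VAL at unlock time, so no atomic
	 * instruction is needed for the store.
	 */
	WRITE_ONCE(l->locked, _Q_SLOW_VAL);
	(void)pv_hash(lock, pn);
}

pv_wait_head() could then just check pn->state for vcpu_hashed instead of
the separate pn->hashed flag, so that flag would go away entirely.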
Cheers,
Longman