linux-kernel - Re: [PATCH v2 3/6] sched: make double-lock-balance fair

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20080827102646.GA5818@wotan.suse.de>
Date:	Wed, 27 Aug 2008 12:26:46 +0200
From:	Nick Piggin <npiggin@...e.de>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Gregory Haskins <ghaskins@...ell.com>, mingo@...e.hu,
	srostedt@...hat.com, linux-kernel@...r.kernel.org,
	linux-rt-users@...r.kernel.org, gregory.haskins@...il.com
Subject: Re: [PATCH v2 3/6] sched: make double-lock-balance fair

On Wed, Aug 27, 2008 at 10:21:35AM +0200, Peter Zijlstra wrote:
> On Tue, 2008-08-26 at 13:35 -0400, Gregory Haskins wrote:
> > double_lock balance() currently favors logically lower cpus since they
> > often do not have to release their own lock to acquire a second lock.
> > The result is that logically higher cpus can get starved when there is
> > a lot of pressure on the RQs.  This can result in higher latencies on
> > higher cpu-ids.
> > 
> > This patch makes the algorithm more fair by forcing all paths to have
> > to release both locks before acquiring them again.  Since callsites to
> > double_lock_balance already consider it a potential preemption/reschedule
> > point, they have the proper logic to recheck for atomicity violations.
> > 
> > Signed-off-by: Gregory Haskins <ghaskins@...ell.com>
> > ---
> > 
> >  kernel/sched.c |   52 +++++++++++++++++++++++++++++++++++++++++++++-------
> >  1 files changed, 45 insertions(+), 7 deletions(-)
> > 
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index df6b447..850b454 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -2782,21 +2782,43 @@ static void double_rq_unlock(struct rq *rq1, struct rq *rq2)
> >  		__release(rq2->lock);
> >  }
> >  
> > +#ifdef CONFIG_PREEMPT
> > +
> >  /*
> > - * double_lock_balance - lock the busiest runqueue, this_rq is locked already.
> > + * fair double_lock_balance: Safely acquires both rq->locks in a fair
> > + * way at the expense of forcing extra atomic operations in all
> > + * invocations.  This assures that the double_lock is acquired using the
> > + * same underlying policy as the spinlock_t on this architecture, which
> > + * reduces latency compared to the unfair variant below.  However, it
> > + * also adds more overhead and therefore may reduce throughput.
> >   */
> > -static int double_lock_balance(struct rq *this_rq, struct rq *busiest)
> > +static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
> > +	__releases(this_rq->lock)
> > +	__acquires(busiest->lock)
> > +	__acquires(this_rq->lock)
> > +{
> > +	spin_unlock(&this_rq->lock);
> > +	double_rq_lock(this_rq, busiest);
> > +
> > +	return 1;
> > +}
> 
> Right - so to belabour Nick's point:
> 
>   if (!spin_trylock(&busiest->lock)) {
>     spin_unlock(&this_rq->lock);
>     double_rq_lock(this_rq, busiest);
>   }
> 
> might unfairly treat someone who is waiting on this_rq if I understand
> it right?
> 
> I suppose one could then write it like:
> 
>   if (spin_is_contended(&this_rq->lock) || !spin_trylock(&busiest->lock)) {
>     spin_unlock(&this_rq->lock);
>     double_rq_lock(this_rq, busiest);
>   }
> 
> But, I'm not sure that's worth the effort at that point..

Yeah, that could work, but hmm it might cause 2 cache coherency transactions
anyway even in the fastpath, so it might even be slower than just unlocking
unconditionally and taking both locks :(

 
> Anyway - I think all this is utterly defeated on CONFIG_PREEMPT by the
> spin with IRQs enabled logic in kernel/spinlock.c.
> 
> Making this an -rt only patch...

Hmm, and also on x86 with ticket locks we don't spin with preempt or
interrupts enabled any more (although we still do of course on other
architectures)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/