Message-ID: <20150409072038.GA30205@gmail.com>
Date: Thu, 9 Apr 2015 09:20:39 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Viresh Kumar <viresh.kumar@...aro.org>,
Ingo Molnar <mingo@...hat.com>, linaro-kernel@...ts.linaro.org,
linux-kernel@...r.kernel.org,
Preeti U Murthy <preeti@...ux.vnet.ibm.com>
Subject: Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct
 check of the active list

* Ingo Molnar <mingo@...nel.org> wrote:
>
> * Peter Zijlstra <peterz@...radead.org> wrote:
>
> > On Thu, Apr 09, 2015 at 08:28:41AM +0200, Ingo Molnar wrote:
> > > Btw., does cpu_base->active_bases even make sense? hrtimer bases are
> > > fundamentally percpu, and to check whether there are any pending
> > > timers is a very simple check:
> > >
> > > base->active->next != NULL
> > >
> >
> > Yeah, that's 3 pointer dereferences from cpu_base, iow you traded a
> > single bit test on an already loaded word for 3 potential cacheline
> > misses.
>
> But the clock bases are not aligned to cachelines, and we have 4 of
> them. So in practice when we access one, we'll load the next one
> anyway.
>
> Furthermore the simplification is measurable, and a fair bit of it is
> in various fast paths. I'd rather trade a bit of a cacheline footprint
> for less overall complexity and faster code.

Plus, look at this code in hrtimer_run_queues():

	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
		base = &cpu_base->clock_base[index];
		if (!base->active.next)
			continue;

		if (gettime) {
			hrtimer_get_softirq_time(cpu_base);
			gettime = 0;
		}

if at least one base is active (on my fairly standard system all cpus
have at least one active hrtimer base all the time - and many cpus
have two bases active), then we run hrtimer_get_softirq_time(), which
dirties the cachelines of all 4 clock bases:

	base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
	base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
	base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
	base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;

so in practice we not only touch every cacheline in every timer
interrupt, but we _dirty_ them, even the inactive ones.

So I'd strongly argue in favor of this series of simplification
patches: it makes the code simpler and faster, and won't impact cache
footprint in practice.
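
To make the cacheline argument concrete, here is a rough userspace sketch.
The toy_* structs are hypothetical, simplified stand-ins and not the real
hrtimer_cpu_base / hrtimer_clock_base layouts, the 64-byte line size is an
assumption, and the cpu base is assumed to start on a cacheline boundary.
It only counts how many distinct lines the four softirq_time stores would
touch in such a layout:

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_BYTES 64	/* assumption: common x86-64 cacheline size */
#define NR_CLOCK_BASES  4	/* stands in for HRTIMER_MAX_CLOCK_BASES */

/* Hypothetical, simplified stand-in for struct hrtimer_clock_base. */
struct toy_clock_base {
	void		*cpu_base;
	int		index;
	int		clockid;
	void		*active_head;
	void		*active_next;	/* plays the role of base->active.next */
	long long	resolution;
	void		*get_time;
	long long	softirq_time;
	long long	offset;
};

/* Hypothetical stand-in for struct hrtimer_cpu_base. */
struct toy_cpu_base {
	int			lock;
	unsigned int		active_bases;
	struct toy_clock_base	clock_base[NR_CLOCK_BASES];
};

/* Byte offset of clock_base[i].softirq_time from the start of the cpu base. */
static size_t softirq_time_offset(int i)
{
	return offsetof(struct toy_cpu_base, clock_base) +
	       (size_t)i * sizeof(struct toy_clock_base) +
	       offsetof(struct toy_clock_base, softirq_time);
}

/* Number of cachelines covered by the whole clock_base[] array. */
static size_t lines_spanned_by_clock_bases(void)
{
	size_t first = offsetof(struct toy_cpu_base, clock_base);
	size_t last = first + sizeof(((struct toy_cpu_base *)0)->clock_base) - 1;

	return last / CACHELINE_BYTES - first / CACHELINE_BYTES + 1;
}

/* Distinct cachelines written when softirq_time is stored in every base. */
static size_t lines_dirtied_by_softirq_time(void)
{
	size_t count = 0, prev = 0;
	int i;

	for (i = 0; i < NR_CLOCK_BASES; i++) {
		size_t line = softirq_time_offset(i) / CACHELINE_BYTES;

		/* Offsets grow with i, so line numbers never decrease. */
		assert(i == 0 || line >= prev);
		if (i == 0 || line != prev)
			count++;
		prev = line;
	}
	return count;
}
```

Printing lines_dirtied_by_softirq_time() next to
lines_spanned_by_clock_bases() for a given layout shows how much of the
clock base array gets written on every pass, regardless of which bases
have timers queued. The real numbers depend on the actual struct layout
and config, so treat this purely as an illustration.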
Thanks,
Ingo