Date: Thu, 9 Apr 2015 09:20:39 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>, Viresh Kumar <viresh.kumar@...aro.org>,
	Ingo Molnar <mingo@...hat.com>, linaro-kernel@...ts.linaro.org,
	linux-kernel@...r.kernel.org, Preeti U Murthy <preeti@...ux.vnet.ibm.com>
Subject: Re: [PATCH] hrtimer: Replace cpu_base->active_bases with a direct check of the active list

* Ingo Molnar <mingo@...nel.org> wrote:

>
> * Peter Zijlstra <peterz@...radead.org> wrote:
>
> > On Thu, Apr 09, 2015 at 08:28:41AM +0200, Ingo Molnar wrote:
> > > Btw., does cpu_base->active_bases even make sense? hrtimer bases are
> > > fundamentally percpu, and to check whether there are any pending
> > > timers is a very simple check:
> > >
> > > 	base->active->next != NULL
> > >
> >
> > Yeah, that's 3 pointer dereferences from cpu_base, iow you traded a
> > single bit test on an already loaded word for 3 potential cacheline
> > misses.
>
> But the clock bases are not aligned to cachelines, and we have 4 of
> them. So in practice when we access one, we'll load the next one
> anyway.
>
> Furthermore the simplification is measurable, and a fair bit of it is
> in various fast paths. I'd rather trade a bit of a cacheline footprint
> for less overall complexity and faster code.

Plus, look at this code in hrtimer_run_queues():

	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
		base = &cpu_base->clock_base[index];
		if (!base->active.next)
			continue;

		if (gettime) {
			hrtimer_get_softirq_time(cpu_base);
			gettime = 0;
		}

if at least one base is active (on my fairly standard system all cpus
have at least one active hrtimer base all the time - and many cpus
have two bases active), then we run hrtimer_get_softirq_time(), which
dirties the cachelines of all 4 clock bases:

	base->clock_base[HRTIMER_BASE_REALTIME].softirq_time = xtim;
	base->clock_base[HRTIMER_BASE_MONOTONIC].softirq_time = mono;
	base->clock_base[HRTIMER_BASE_BOOTTIME].softirq_time = boot;
	base->clock_base[HRTIMER_BASE_TAI].softirq_time = tai;

so in practice we not only touch every cacheline in every timer
interrupt, but we _dirty_ them, even the inactive ones.

So I'd strongly argue in favor of this patch series of simplification:
it makes the code simpler and faster, and won't impact cache footprint
in practice.

Thanks,

	Ingo
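
For readers following the thread, the two checks under discussion can be sketched in user-space C. This is only an illustration, not kernel code: the struct layouts, the list_node type, the enqueue() helper and the function names are simplified, hypothetical stand-ins, and only the active_bases bit test and the active.next != NULL test correspond to what the messages above describe.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CLOCK_BASES 4	/* stands in for HRTIMER_MAX_CLOCK_BASES */

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct list_node {
	struct list_node *next;
};

struct clock_base {
	struct list_node active;	/* active.next == NULL: nothing queued */
	long softirq_time;
};

struct cpu_base {
	unsigned int active_bases;	/* bit i set: clock_base[i] has timers */
	struct clock_base clock_base[MAX_CLOCK_BASES];
};

/* Variant 1: a single bit test on a word kept in cpu_base. */
static int base_active_bitmask(const struct cpu_base *cb, int i)
{
	return (cb->active_bases >> i) & 1u;
}

/* Variant 2: look directly at the per-base active list. */
static int base_active_direct(const struct cpu_base *cb, int i)
{
	return cb->clock_base[i].active.next != NULL;
}

/* Illustrative enqueue that keeps both representations in sync. */
static void enqueue(struct cpu_base *cb, int i, struct list_node *t)
{
	assert(i >= 0 && i < MAX_CLOCK_BASES);
	t->next = cb->clock_base[i].active.next;
	cb->clock_base[i].active.next = t;
	cb->active_bases |= 1u << i;
}

/* Walk all bases using only the direct check, as the quoted loop does. */
static int count_active_direct(const struct cpu_base *cb)
{
	int n = 0;

	for (int i = 0; i < MAX_CLOCK_BASES; i++)
		n += base_active_direct(cb, i);
	return n;
}
```

In this sketch the bitmask is pure bookkeeping that has to be updated on every enqueue (and, in a fuller version, on every dequeue), whereas the direct check needs no extra state but reads each clock base. That is the trade-off the thread argues about.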