Date: Thu, 10 Jul 2014 15:47:54 +0530
From: Viresh Kumar <viresh.kumar@...aro.org>
To: Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: Frédéric Weisbecker <fweisbec@...il.com>, Preeti U Murthy <preeti@...ux.vnet.ibm.com>, Lists linaro-kernel <linaro-kernel@...ts.linaro.org>, Linaro Networking <linaro-networking@...aro.org>, Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, Steven Rostedt <rostedt@...dmis.org>, Kevin Hilman <khilman@...aro.org>, Santosh Shukla <santosh.shukla@...aro.org>, Arvind Chauhan <Arvind.Chauhan@....com>
Subject: [Bug] Spurious hrtimer-interrupts

Hi Thomas/Daniel et al,

This isn't about the problem I reported earlier, where you advised adding a ONESHOT_STOPPED mode: https://lkml.org/lkml/2014/5/9/508. That problem was about stopping the clock-event device when it's not used anymore.

This ($subject) problem was initially spotted by Santosh on an Ivybridge V2, 12-core x86 server. I then reproduced it on a dual-core ARM Exynos (it isn't as frequent there as it was on x86, though).

Problem: Getting spurious ticks where hrtimer_interrupt() returns without servicing any hrtimers.

Kernel hack to catch this: http://pastebin.com/bTM7nqDc (over 3.16-rc3)
X86 boot logs: http://pastebin.com/E6axDnsa (search: hrtimer_interrupt)
/proc/cpuinfo: http://pastebin.com/uQx9TmsA

The furthest I could debug it:
- The clockevent device is programmed for time 'x' seconds (verified by storing next-event from within lapic_next_event()).
- The tick fires ~300 us before 'x'.
- Traversing the list of hrtimers doesn't find any pending hrtimer, so we simply return. Hence the *spurious* interrupt.
- It happens whether ticks are active or stopped (search for "tick-stopped" in the logs).

Driver monitored for x86: arch/x86/kernel/apic/apic.c
Similar behavior observed on Exynos with arm_arch_timer.c

I couldn't get any deeper into it to see what's going on.
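The symptom listed above (the interrupt arrives before the programmed event and no hrtimer has expired yet) can be modelled in user space. What follows is a minimal sketch only: it is neither the pastebin hack nor kernel code, and struct timer_model, expired_count(), is_spurious() and early_by_ns() are hypothetical names chosen for illustration, with all times held as plain nanosecond counters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a queued hrtimer: only its expiry matters here. */
struct timer_model {
    int64_t expires_ns;
};

/* Count the timers whose expiry is at or before 'now_ns'. */
int expired_count(const struct timer_model *timers, int n, int64_t now_ns)
{
    int count = 0;

    for (int i = 0; i < n; i++) {
        if (timers[i].expires_ns <= now_ns)
            count++;
    }
    return count;
}

/* An interrupt is "spurious" in the sense of this report if the handler
 * would walk the queue and find nothing to service. */
bool is_spurious(const struct timer_model *timers, int n, int64_t now_ns)
{
    return expired_count(timers, n, now_ns) == 0;
}

/* How far ahead of the programmed event the interrupt fired
 * (positive means it fired early). */
int64_t early_by_ns(int64_t programmed_ns, int64_t now_ns)
{
    return programmed_ns - now_ns;
}
```

With a single timer due at 1,000,000 ns and an interrupt taken at 700,000 ns, is_spurious() reports true and early_by_ns() returns 300,000 ns, which is the ~300 us pattern described in the list.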
From the behavior, it looks like the calculations we are doing with dev->mult/shift give timeout <= next-event, whereas it should be >= ? Not at all sure though.

Reported-by: Santosh Shukla <santosh.shukla@...aro.org>

Note: Even the hacky patchset that tried to disable the clockevent device when it's not used anymore isn't able to fix it: https://lkml.org/lkml/2014/5/9/99

--
viresh
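To make the dev->mult/shift suspicion concrete, here is a rough user-space sketch of a multiply-and-shift conversion from a nanosecond delta to device cycles. It is an illustration under assumptions, not the kernel implementation: calc_mult(), ns_to_cycles() and cycles_to_ns() are hypothetical helpers, and the 24 MHz frequency and shift of 32 are made-up values. The point it shows is that both the division that produces mult and the final right shift round down, so the programmed cycle count can stand for slightly less time than the requested delta.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: scaled factor so that cycles = (ns * mult) >> shift.
 * The integer division truncates, so mult is rounded down. */
uint64_t calc_mult(uint64_t freq_hz, unsigned int shift)
{
    assert(shift < 64);
    return (freq_hz << shift) / NSEC_PER_SEC;
}

/* Convert a nanosecond delta to device cycles; the shift truncates again. */
uint64_t ns_to_cycles(uint64_t delta_ns, uint64_t mult, unsigned int shift)
{
    return (delta_ns * mult) >> shift;
}

/* Convert cycles back to nanoseconds exactly, to measure the shortfall. */
uint64_t cycles_to_ns(uint64_t cycles, uint64_t freq_hz)
{
    return (cycles * NSEC_PER_SEC) / freq_hz;
}
```

For the assumed 24 MHz device and shift of 32, a 1,000,000 ns delta becomes 23999 cycles instead of 24000, which corresponds to about 999,958 ns, so the event would be due roughly 42 ns early. That shortfall is under one cycle in this toy case; whether rounding of this kind accounts for the ~300 us gap reported here is not something the sketch establishes.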