lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 14 Feb 2019 20:05:49 +0100 From: Heiner Kallweit <hkallweit1@...il.com> To: Frederic Weisbecker <frederic@...nel.org> Cc: Thomas Gleixner <tglx@...utronix.de>, Anna-Maria Gleixner <anna-maria@...utronix.de>, Linux Kernel Mailing List <linux-kernel@...r.kernel.org> Subject: Re: Fix 80d20d35af1e ("nohz: Fix local_timer_softirq_pending()") may have revealed another problem On 24.01.2019 20:37, Heiner Kallweit wrote: > On 16.01.2019 07:24, Frederic Weisbecker wrote: >> On Fri, Dec 28, 2018 at 12:11:12AM +0100, Heiner Kallweit wrote: >>> >>> # tracer: nop >>> # >>> # _-----=> irqs-off >>> # / _----=> need-resched >>> # | / _---=> hardirq/softirq >>> # || / _--=> preempt-depth >>> # ||| / delay >>> # TASK-PID CPU# |||| TIMESTAMP FUNCTION >>> # | | | |||| | | >>> <idle>-0 [001] d.h2 1479.099092: softirq_raise: vec=1 [action=TIMER] >>> <idle>-0 [001] d.h2 1479.099098: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.099106: softirq_raise: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.099114: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.099120: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.099121: softirq_entry: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.099134: softirq_exit: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.099135: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.099141: softirq_exit: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.100094: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.100109: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.100116: softirq_exit: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.101091: softirq_raise: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.101113: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.101118: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] d.h2 1479.102094: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.102114: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.102121: softirq_exit: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.103091: softirq_raise: vec=1 [action=TIMER] >>> <idle>-0 [001] d.h2 1479.103097: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.103105: softirq_raise: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.103114: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.103118: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.103119: softirq_entry: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.103131: softirq_exit: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.103132: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.103138: softirq_exit: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.105092: softirq_raise: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.105115: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.105119: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] d.h2 1479.106092: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.106112: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] .Ns2 1479.106144: softirq_exit: vec=9 [action=RCU] >>> cpuhp/1-13 [001] d..2 1479.106279: timer_cancel: timer=0000000009a25653 >>> <idle>-0 [001] d.h2 1479.106965: softirq_raise: vec=1 [action=TIMER] >>> <idle>-0 [001] d.h2 1479.106969: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.106974: softirq_raise: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.106981: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.106984: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.106985: softirq_entry: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.106994: softirq_exit: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.106995: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.106999: softirq_exit: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.107996: softirq_raise: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.108010: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.108014: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] d.h2 1479.109009: softirq_raise: vec=1 [action=TIMER] >>> <idle>-0 [001] d.h2 1479.109013: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.109024: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.109028: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.109028: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.109033: softirq_exit: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.110013: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.110033: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] ..s2 1479.110040: softirq_exit: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.111011: softirq_raise: vec=1 [action=TIMER] >>> <idle>-0 [001] d.h2 1479.111017: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] d.h2 1479.111026: softirq_raise: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.111035: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.111040: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] ..s2 1479.111040: softirq_entry: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.111052: softirq_exit: vec=7 [action=SCHED] >>> <idle>-0 [001] ..s2 1479.111052: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] .Ns2 1479.111079: softirq_exit: vec=9 [action=RCU] >>> cpuhp/1-13 [001] dNh2 1479.112930: softirq_raise: vec=1 [action=TIMER] >>> cpuhp/1-13 [001] dNh2 1479.112935: softirq_raise: vec=9 [action=RCU] >>> <idle>-0 [001] d..1 1479.113077: <stack trace> >>> => can_stop_idle_tick.isra.14 >>> => tick_nohz_get_sleep_length >>> => menu_select >>> => cpuidle_select >>> => do_idle >>> => cpu_startup_entry >>> => start_secondary >>> => secondary_startup_64 >>> <idle>-0 [001] .Ns2 1479.113110: softirq_entry: vec=1 [action=TIMER] >>> <idle>-0 [001] .Ns2 1479.113114: softirq_exit: vec=1 [action=TIMER] >>> <idle>-0 [001] .Ns2 1479.113115: softirq_entry: vec=9 [action=RCU] >>> <idle>-0 [001] .Ns2 1479.113139: softirq_exit: vec=9 [action=RCU] >> >> >> So, I'm being very doubtful now actually. >> >> First of all, I realize my patch only fixes things halfway. Only softirq raised from >> hardirqs are going to be handled after ksoftirqd is parked. If it's raised outside >> interrupts, the softirq is going to be ignored. If other smpboot kthreads, following >> ksoftirqd parking, raise softirqs from their park() callback we should at least warn >> that we are ignoring that. >> >> Also, I'm still not sure if you're running threaded IRQs. Very likely not as >> the softirqs in your traces are executed in idle, which means from an IRQ interrupting >> idle. You haven't passed "threadirqs" in your kernel parameters, right? >> In any case we need to figure out why your softirqs raised at 1479.112930 and >> 1479.112935 don't get handled. >> >> Finally I don't understand why you have so many timer softirqs and yet no timer traces >> at all, I turned them on in the script I gave you. So something weird is happening with >> the timers code. At best we are wasting lots of CPU cycles with those empty softirqs, at >> worst we are miscalculating some expiration somewhere, possibly ignoring a timer. >> >> So I'll need to cook up an updated debug patch for you to trigger if you don't >> mind. >> > Hi Frederic, > is the update debug patch still to come? > >> Thanks. >> > Heiner > Hi Frederic, before it's forgotten: If I can test more, just let me know. Heiner
Powered by blists - more mailing lists