lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f60567f7-fa7e-f424-12e6-b3df7a24c0be@gmail.com>
Date:   Thu, 24 Jan 2019 20:37:20 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Anna-Maria Gleixner <anna-maria@...utronix.de>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Fix 80d20d35af1e ("nohz: Fix local_timer_softirq_pending()") may
 have revealed another problem

On 16.01.2019 07:24, Frederic Weisbecker wrote:
> On Fri, Dec 28, 2018 at 12:11:12AM +0100, Heiner Kallweit wrote:
>>
>> # tracer: nop
>> #
>> #                              _-----=> irqs-off
>> #                             / _----=> need-resched
>> #                            | / _---=> hardirq/softirq
>> #                            || / _--=> preempt-depth
>> #                            ||| /     delay
>> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
>> #              | |       |   ||||       |         |
>>           <idle>-0     [001] d.h2  1479.099092: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [001] d.h2  1479.099098: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.099106: softirq_raise: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.099114: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.099120: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.099121: softirq_entry: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.099134: softirq_exit: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.099135: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.099141: softirq_exit: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.100094: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.100109: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.100116: softirq_exit: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.101091: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.101113: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.101118: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] d.h2  1479.102094: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.102114: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.102121: softirq_exit: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.103091: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [001] d.h2  1479.103097: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.103105: softirq_raise: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.103114: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.103118: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.103119: softirq_entry: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.103131: softirq_exit: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.103132: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.103138: softirq_exit: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.105092: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.105115: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.105119: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] d.h2  1479.106092: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.106112: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] .Ns2  1479.106144: softirq_exit: vec=9 [action=RCU]
>>          cpuhp/1-13    [001] d..2  1479.106279: timer_cancel: timer=0000000009a25653
>>           <idle>-0     [001] d.h2  1479.106965: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [001] d.h2  1479.106969: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.106974: softirq_raise: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.106981: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.106984: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.106985: softirq_entry: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.106994: softirq_exit: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.106995: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.106999: softirq_exit: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.107996: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.108010: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.108014: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] d.h2  1479.109009: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [001] d.h2  1479.109013: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.109024: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.109028: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.109028: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.109033: softirq_exit: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.110013: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.110033: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] ..s2  1479.110040: softirq_exit: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.111011: softirq_raise: vec=1 [action=TIMER]
>>           <idle>-0     [001] d.h2  1479.111017: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] d.h2  1479.111026: softirq_raise: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.111035: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.111040: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] ..s2  1479.111040: softirq_entry: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.111052: softirq_exit: vec=7 [action=SCHED]
>>           <idle>-0     [001] ..s2  1479.111052: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] .Ns2  1479.111079: softirq_exit: vec=9 [action=RCU]
>>          cpuhp/1-13    [001] dNh2  1479.112930: softirq_raise: vec=1 [action=TIMER]
>>          cpuhp/1-13    [001] dNh2  1479.112935: softirq_raise: vec=9 [action=RCU]
>>           <idle>-0     [001] d..1  1479.113077: <stack trace>
>>  => can_stop_idle_tick.isra.14
>>  => tick_nohz_get_sleep_length
>>  => menu_select
>>  => cpuidle_select
>>  => do_idle
>>  => cpu_startup_entry
>>  => start_secondary
>>  => secondary_startup_64
>>           <idle>-0     [001] .Ns2  1479.113110: softirq_entry: vec=1 [action=TIMER]
>>           <idle>-0     [001] .Ns2  1479.113114: softirq_exit: vec=1 [action=TIMER]
>>           <idle>-0     [001] .Ns2  1479.113115: softirq_entry: vec=9 [action=RCU]
>>           <idle>-0     [001] .Ns2  1479.113139: softirq_exit: vec=9 [action=RCU]
> 
> 
> So, I'm being very doubtful now actually.
> 
> First of all, I realize my patch only fixes things halfway. Only softirq raised from
> hardirqs are going to be handled after ksoftirqd is parked. If it's raised outside
> interrupts, the softirq is going to be ignored. If other smpboot kthreads, following
> ksoftirqd parking, raise softirqs from their park() callback we should at least warn
> that we are ignoring that.
> 
> Also, I'm still not sure if you're running threaded IRQs. Very likely not as
> the softirqs in your traces are executed in idle, which means from an IRQ interrupting
> idle. You haven't passed "threadirqs" in your kernel parameters, right?
> In any case we need to figure out why your softirqs raised at 1479.112930 and
> 1479.112935 don't get handled.
> 
> Finally I don't understand why you have so many timer softirqs and yet no timer traces
> at all, I turned them on in the script I gave you. So something weird is happening with
> the timers code. At best we are wasting lots of CPU cycles with those empty softirqs, at
> worst we are miscalculating some expiration somewhere, possibly ignoring a timer.
> 
> So I'll need to cook up an updated debug patch for you to trigger if you don't
> mind.
> 
Hi Frederic,
is the update debug patch still to come?

> Thanks.
> 
Heiner

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ