Message-ID: <5aa51fc1-5a5c-0c61-5c28-0d9ca98e4514@gmail.com>
Date:   Fri, 28 Dec 2018 07:39:32 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Anna-Maria Gleixner <anna-maria@...utronix.de>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Grygorii Strashko <grygorii.strashko@...com>
Subject: Re: Fix 80d20d35af1e ("nohz: Fix local_timer_softirq_pending()") may
 have revealed another problem

On 28.12.2018 07:34, Heiner Kallweit wrote:
> On 28.12.2018 02:31, Frederic Weisbecker wrote:
>> On Fri, Dec 28, 2018 at 12:11:12AM +0100, Heiner Kallweit wrote:
>>>
> [...]
>>
>> Interesting, the softirq is raised from hardirq but it's not handled in the end of
>> the IRQ. Are you running threaded IRQS by any chance? If so I would expect ksoftirqd
>> to handle the pending work before we go idle. However I can imagine a small window
>> where such an expectation may not be met: if the softirq is raised after the ksoftirqd
>> thread is parked (CPUHP_AP_SMPBOOT_THREADS), which is right before we disable the CPU
>> (CPUHP_TEARDOWN_CPU).
>>
> I have a network driver (r8169) using NAPI which runs in softirq context AFAIK.
> For testing purposes I sometimes trigger system suspend via network, so there is
> network adapter activity when system suspends. Apart from that nothing really
> exciting:
>             CPU0       CPU1       CPU2       CPU3
>    0:         43          0          0          0   IO-APIC    2-edge      timer
>    1:          4          0          0          0   IO-APIC    1-edge      i8042
>    8:          0          1          0          0   IO-APIC    8-fasteoi   rtc0
>    9:          0          0          0          0   IO-APIC    9-fasteoi   acpi
>   12:          0          0          0          5   IO-APIC   12-edge      i8042
>  120:          0          0          0          0   PCI-MSI 311296-edge      PCIe PME
>  121:          0          0          0          0   PCI-MSI 315392-edge      PCIe PME
>  122:          0          0          0          0   PCI-MSI 327680-edge      PCIe PME
>  123:          0          0       3328          0   PCI-MSI 294912-edge      ahci[0000:00:12.0]
>  124:          0        133          0          0   PCI-MSI 344064-edge      xhci_hcd
>  125:          0          0         32          0   PCI-MSI 245760-edge      mei_me
>  127:        381          0          0          0   PCI-MSI 1572864-edge      enp3s0
>  128:          0          0          0        236   PCI-MSI 32768-edge      i915
>  129:          0        374          0          0   PCI-MSI 229376-edge      snd_hda_intel:card0
> 
>> I don't know if we can afford to ignore a softirq even at this late stage. We should
>> probably avoid leaking any. So here is a possible fix, if you don't mind trying:
>>
> I tested your patch and at least in the first minutes of testing couldn't reproduce
> the issue any longer. I tested manual system suspend and the following script you
> sent when we started to analyze the issue.
> 

The issue did not recur after longer testing either, so it seems both your
analysis and your approach to fixing it were right. Thanks!
I'll let you know if the issue pops up again under special circumstances.


> Heiner
> 
> --------------------------------------------------------------------------
> 
> #!/bin/bash
> 
> do_hotplug()
> {
> 	for i in $(seq 1 $2)
> 	do
> 		echo $1 > /sys/devices/system/cpu/cpu$i/online
> 	done
> }
> 
> LAST_CPU=$(($(nproc)-1))
> 
> while true
> do
> 	do_hotplug 0 $LAST_CPU
> 	do_hotplug 1 $LAST_CPU
> done
> 
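For repeatable test runs, the endless loop above could be bounded. The following is only a sketch under assumptions, not something posted in this thread: the names SYSFS_ROOT, set_online and hotplug_cycles are hypothetical and introduced here for illustration. Like the original script, it starts at cpu1 and leaves CPU0 alone. Files that are not writable are skipped silently.

```shell
#!/bin/bash
# Sketch: bounded variant of the CPU hotplug stress loop.
# SYSFS_ROOT, set_online and hotplug_cycles are hypothetical names (assumptions).

# Directory holding the cpuN/online files; overridable so the logic can be
# exercised against a scratch directory without root.
SYSFS_ROOT="${SYSFS_ROOT:-/sys/devices/system/cpu}"

set_online()
{
	# $1 = 0 (offline) or 1 (online), $2 = highest CPU index to touch
	local i f
	for i in $(seq 1 "$2")
	do
		f="$SYSFS_ROOT/cpu$i/online"
		# Skip CPUs whose control file is missing or not writable
		if [ -w "$f" ]
		then
			echo "$1" > "$f"
		fi
	done
	return 0
}

hotplug_cycles()
{
	# $1 = number of offline/online cycles, $2 = highest CPU index
	local n
	for n in $(seq 1 "$1")
	do
		set_online 0 "$2"
		set_online 1 "$2"
	done
}
```

Run as root, a call such as `hotplug_cycles 100 "$(($(nproc) - 1))"` would do 100 offline/online rounds and then stop with all touched CPUs back online.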
