Date:   Wed, 9 Jan 2019 23:20:50 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Anna-Maria Gleixner <anna-maria@...utronix.de>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Grygorii Strashko <grygorii.strashko@...com>
Subject: Re: Fix 80d20d35af1e ("nohz: Fix local_timer_softirq_pending()") may
 have revealed another problem

On 28.12.2018 07:39, Heiner Kallweit wrote:
> On 28.12.2018 07:34, Heiner Kallweit wrote:
>> On 28.12.2018 02:31, Frederic Weisbecker wrote:
>>> On Fri, Dec 28, 2018 at 12:11:12AM +0100, Heiner Kallweit wrote:
>>>>
>> [...]
>>>
>>> Interesting, the softirq is raised from hardirq but it's not handled at the end of
>>> the IRQ. Are you running threaded IRQs by any chance? If so I would expect ksoftirqd
>>> to handle the pending work before we go idle. However I can imagine a small window
>>> where such an expectation may not be met: if the softirq is raised after the ksoftirqd
>>> thread is parked (CPUHP_AP_SMPBOOT_THREADS), which is right before we disable the CPU
>>> (CPUHP_TEARDOWN_CPU).
>>>
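For reference, a minimal sketch of the ordering described above, assuming the
cpuhp_state layout of include/linux/cpuhotplug.h around this kernel version
(plain userspace C, heavily simplified, not kernel code): on CPU offline the
states are torn down from the top of the enum downwards, so ksoftirqd is parked
at CPUHP_AP_SMPBOOT_THREADS before the CPU is disabled at CPUHP_TEARDOWN_CPU,
and a softirq raised in between has no ksoftirqd left to service it.

/*
 * Sketch of the relevant hotplug state ordering; state names follow
 * include/linux/cpuhotplug.h, the set of states is heavily simplified.
 */
#include <stdio.h>

enum cpuhp_state {                     /* low to high, simplified */
	CPUHP_TEARDOWN_CPU,            /* CPU actually taken down here */
	CPUHP_AP_ONLINE_IDLE,
	CPUHP_AP_SMPBOOT_THREADS,      /* ksoftirqd parked when this is undone */
	CPUHP_ONLINE,
};

static const char * const state_name[] = {
	[CPUHP_TEARDOWN_CPU]       = "CPUHP_TEARDOWN_CPU",
	[CPUHP_AP_ONLINE_IDLE]     = "CPUHP_AP_ONLINE_IDLE",
	[CPUHP_AP_SMPBOOT_THREADS] = "CPUHP_AP_SMPBOOT_THREADS",
	[CPUHP_ONLINE]             = "CPUHP_ONLINE",
};

int main(void)
{
	int st;

	/* Offline path: walk the states downwards, as the hotplug core does. */
	for (st = CPUHP_ONLINE; st >= CPUHP_TEARDOWN_CPU; st--)
		printf("teardown step: %s\n", state_name[st]);

	return 0;
}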
>> I have a network driver (r8169) using NAPI, which runs in softirq context AFAIK.
>> For testing purposes I sometimes trigger system suspend via network, so there is
>> network adapter activity when the system suspends. Apart from that nothing really
>> exciting:
>>             CPU0       CPU1       CPU2       CPU3
>>    0:         43          0          0          0   IO-APIC    2-edge      timer
>>    1:          4          0          0          0   IO-APIC    1-edge      i8042
>>    8:          0          1          0          0   IO-APIC    8-fasteoi   rtc0
>>    9:          0          0          0          0   IO-APIC    9-fasteoi   acpi
>>   12:          0          0          0          5   IO-APIC   12-edge      i8042
>>  120:          0          0          0          0   PCI-MSI 311296-edge      PCIe PME
>>  121:          0          0          0          0   PCI-MSI 315392-edge      PCIe PME
>>  122:          0          0          0          0   PCI-MSI 327680-edge      PCIe PME
>>  123:          0          0       3328          0   PCI-MSI 294912-edge      ahci[0000:00:12.0]
>>  124:          0        133          0          0   PCI-MSI 344064-edge      xhci_hcd
>>  125:          0          0         32          0   PCI-MSI 245760-edge      mei_me
>>  127:        381          0          0          0   PCI-MSI 1572864-edge      enp3s0
>>  128:          0          0          0        236   PCI-MSI 32768-edge      i915
>>  129:          0        374          0          0   PCI-MSI 229376-edge      snd_hda_intel:card0
>>
>>> I don't know if we can afford to ignore a softirq even at this late stage. We should
>>> probably avoid leaking any. So here is a possible fix, if you don't mind trying:
>>>
>> I tested your patch and, at least in the first minutes of testing, couldn't reproduce
>> the issue any longer. I tested manual system suspend and the script below, which you
>> sent when we started analyzing the issue.
>>
> 
> Also after some more testing time the issue didn't occur again, so it seems your
> analysis was right, and the fix approach as well. Thanks!
> I'll let you know in case the issue pops up again under special
> circumstances.
> 
Frederic, so far this fix hasn't appeared in linux-next; are you going to submit it?

> 
>> Heiner
>>
>> --------------------------------------------------------------------------
>>
>> #!/bin/bash
>>
>> # Flip CPUs 1..$2 to state $1 (0 = offline, 1 = online) via sysfs.
>> # CPU0 is skipped since it usually cannot be offlined.
>> do_hotplug()
>> {
>> 	for i in $(seq 1 $2)
>> 	do
>> 		echo $1 > /sys/devices/system/cpu/cpu$i/online
>> 	done
>> }
>>
>> # Highest CPU index on this system.
>> LAST_CPU=$(($(nproc)-1))
>>
>> # Stress CPU hotplug: offline and re-online all secondary CPUs in a loop.
>> while true
>> do
>> 	do_hotplug 0 $LAST_CPU
>> 	do_hotplug 1 $LAST_CPU
>> done
>>
> 
