Date:   Thu, 16 Apr 2020 17:02:59 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     John Garry <john.garry@...wei.com>, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: [Regression] No IO interrupt is generated before CPU is offline

Hi Thomas,

When I run the test script [1] in a KVM guest [2] with a virtio-scsi
disk, an IO hang can be triggered easily. Most of the time it can be
reproduced by running './cpuhotplug_io 400 /dev/sda' once; sometimes
it needs one more run.
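
For readers without access to [1], the CPU hotplug half of such a test
can be sketched as below. This is not the referenced script, whose
contents are not shown here; the function names are hypothetical, and
the sketch issues no IO by itself. It relies only on the standard sysfs
'online' file for each CPU.

```shell
# Set the online state of one CPU via sysfs (needs root on a real system).
# $1 = CPU number, $2 = 0 (offline) or 1 (online).
# $3 optionally overrides the sysfs base directory, useful for dry runs.
set_cpu_online() {
    n="$1"
    val="$2"
    base="${3:-/sys/devices/system/cpu}"
    echo "$val" > "$base/cpu$n/online"
}

# Take each CPU from 1 to ncpus-1 offline and bring it back, one at a time.
# cpu0 is skipped because it often cannot be taken offline.
cycle_cpus() {
    ncpus="$1"
    root="${2:-/sys/devices/system/cpu}"
    cpu=1
    while [ "$cpu" -lt "$ncpus" ]; do
        set_cpu_online "$cpu" 0 "$root"
        set_cpu_online "$cpu" 1 "$root"
        cpu=$((cpu + 1))
    done
}
```

On the 8-CPU guest described in [2], 'cycle_cpus 8' run as root while
IO is in flight on /dev/sda would exercise the same hotplug path.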

After checking the blk-mq debugfs log, I found that these requests
have been queued to the virtio-scsi hardware, but no interrupts are
generated.
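
One way to look at the same information is to dump the per-hctx 'busy'
files from blk-mq debugfs. The helper below is only a sketch: its name
is hypothetical, and it assumes debugfs is mounted at /sys/kernel/debug
and that the running kernel exposes hctx*/busy for the device.

```shell
# Print the in-flight requests of every hardware queue of a block device.
# $1 = device name (e.g. sda).
# $2 optionally overrides the blk-mq debugfs directory.
show_busy() {
    dev="$1"
    dbg="${2:-/sys/kernel/debug/block}"
    for f in "$dbg/$dev"/hctx*/busy; do
        # Skip the unexpanded glob when no hctx directory exists.
        [ -e "$f" ] || continue
        printf '== %s ==\n' "$f"
        cat "$f"
    done
}
```

Running 'show_busy sda' as root during the hang lists the requests
that were dispatched to the device but never completed.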

The issue was first found when John and I tested the patchset [3][4]
for draining IO in the CPU hotplug handler before the CPU and its
managed IRQ are shut down. IOs are found not completed even though the
CPU responsible for handling this hw queue is still online, but is
about to be shut down.

git-bisect shows that the issue is introduced by the following commit:

	60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")


The issue can't be triggered any more after applying the following change:

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 69881b2d446c..c5e9f005fbb2 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1596,7 +1596,7 @@ int native_cpu_disable(void)
         * it. It still responds normally to INIT, NMI, SMI, and SIPI
         * messages.
         */
-       apic_soft_disable();
+       clear_local_APIC();
        cpu_disable_common();
 
        return 0;


[1] test script
http://people.redhat.com/minlei/tests/tools/cpuhotplug_io

[2] virtio-scsi is made multi-queue by passing 'num_queues=3' on the
qemu virtio-scsi command line; the CPU count is set to 8, so each
queue is covered by more than one CPU
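
A guest of this shape could be started with a command line like the one
printed below. Only 'num_queues=3' and the 8 CPUs come from the
description above; the disk image path, the device IDs and the helper
name are placeholders for illustration.

```shell
# Print (do not run) a QEMU command line for an 8-CPU guest whose
# virtio-scsi controller has 3 queues.
# $1 = path to the disk image (placeholder default: disk.img).
qemu_cmdline() {
    disk_img="${1:-disk.img}"
    printf '%s ' qemu-system-x86_64 -enable-kvm -smp 8 \
        -device virtio-scsi-pci,id=scsi0,num_queues=3 \
        -drive "file=$disk_img,if=none,id=hd0" \
        -device scsi-hd,drive=hd0,bus=scsi0.0
    printf '\n'
}
```

With 8 CPUs spread over 3 queues, every hw queue is mapped to at least
two CPUs, which is the setup in which the hang was observed.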

[3] https://lore.kernel.org/linux-block/20200407092901.314228-5-ming.lei@redhat.com/

[4] latest patches for stop & drain IO before shutdown irq/cpu
https://github.com/ming1/linux/commits/v5.6-blk-mq-improve-cpu-hotplug



Thanks,
Ming
