lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1435811074-30062-1-git-send-email-jin.xiao@intel.com>
Date:	Thu,  2 Jul 2015 12:24:34 +0800
From:	xiao jin <jin.xiao@...el.com>
To:	jroedel@...e.de, tglx@...utronix.de, mingo@...hat.com,
	hpa@...or.com, x86@...nel.org, bp@...e.de,
	boris.ostrovsky@...cle.com, dave.hansen@...ux.intel.com,
	rientjes@...gle.com, imammedo@...hat.com,
	paulmck@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
	peterz@...radead.org, mingo@...nel.org
Cc:	yanmin_zhang@...ux.intel.com, xiao jin <jin.xiao@...el.com>
Subject: [PATCH] smpboot.c: move setup_vector_irq after set_cpu_online

While running cpuhotplug + reboot test, I can easily hit a IPANIC on kernel 3.14.

[  106.107851] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[  106.116702] IP:
[  106.118490]  [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
[  106.126809] PGD 0
[  106.129110] Oops: 0000 [#1] PREEMPT SMP
[  106.133613] Modules linked in: atomisp_css2401a0_v21 lm3554 ov2722 hid_sensor_hub sens_col_core hid_heci_ish heci_ish heci vidt_driver rfkill_gpi     o bcmdhd_pcie(O) cfg80211 ov5693 videobuf_vmalloc pn544_nfc(C) videobuf_core bt_lpm 6lowpan_iphc ip6table_raw iptable_raw atmel_mxt_ts
[  106.161897] CPU: 2 PID: 18 Comm: migration/2 Tainted: G        WC O 3.14.37-x86_64-L1-R467-g68db82c #1
[  106.172323] Hardware name: Intel Corporation CHERRYVIEW C0 PLATFORM/Cherry Trail FFD, BIOS CH2TFFD.X64.0004.R83.1506171149 06/17/2015
[  106.185758] task: ffff880077e98510 ti: ffff880077e9a000 task.ti: ffff880077e9a000
[  106.194143] RIP: 0010:[<ffffffff810044f6>]
[  106.198646]  [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
[  106.206969] RSP: 0000:ffff880077e9bcf8  EFLAGS: 00010046
[  106.212926] RAX: 0000000000000000 RBX: 00000000000000d3 RCX: 0000000000000000
[  106.220921] RDX: 0000000000000000 RSI: 0000000000000088 RDI: 0000000000000001
[  106.228918] RBP: ffff880077e9bd28 R08: 0000000000000000 R09: ffff8800784008e0
[  106.236915] R10: 000000000000000a R11: 0000000000000000 R12: 0000000000000015
[  106.244911] R13: 0000000000000000 R14: 0000000000000088 R15: 0000000000000002
[  106.252898] FS:  0000000000000000(0000) GS:ffff88007a300000(0000) knlGS:0000000000000000
[  106.261961] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  106.268405] CR2: 0000000000000040 CR3: 000000006e03c000 CR4: 00000000001007e0
[  106.276400] Last Branch Records:
[  106.280052]    to: [<ffffffff819f3840>] page_fault+0x0/0x80
[  106.286335]  from: [<ffffffff810044f6>] check_irq_vectors_for_cpu_disable+0x76/0x180
[  106.295044]    to: [<ffffffff810044f3>] check_irq_vectors_for_cpu_disable+0x73/0x180
[  106.303749]  from: [<ffffffff810da4c8>] irq_to_desc+0x18/0x20
[  106.310227]    to: [<ffffffff810da4c7>] irq_to_desc+0x17/0x20
[  106.316701]  from: [<ffffffff8137347c>] radix_tree_lookup+0xc/0x10
[  106.323655]    to: [<ffffffff8137347b>] radix_tree_lookup+0xb/0x10
[  106.330616]  from: [<ffffffff81373425>] radix_tree_lookup_element+0x55/0x90
[  106.338450]    to: [<ffffffff81373400>] radix_tree_lookup_element+0x30/0x90
[  106.346274]  from: [<ffffffff81373420>] radix_tree_lookup_element+0x50/0x90
[  106.354108]    to: [<ffffffff8137340b>] radix_tree_lookup_element+0x3b/0x90
[  106.361934]  from: [<ffffffff813733fd>] radix_tree_lookup_element+0x2d/0x90
[  106.369750]    to: [<ffffffff813733d0>] radix_tree_lookup_element+0x0/0x90
[  106.377487]  from: [<ffffffff81373476>] radix_tree_lookup+0x6/0x10
[  106.384447]    to: [<ffffffff81373470>] radix_tree_lookup+0x0/0x10
[  106.391408]  from: [<ffffffff810da4c2>] irq_to_desc+0x12/0x20
[  106.397882] Stack:
[  106.400140]  00000002810bb4f2 ffff88006e07bde8 ffff88006e07bd88 0000000000000000
[  106.408551]  ffffffff8110a801 0000000000000202 ffff880077e9bd40 ffffffff81030f62
[  106.416967]  0000000000000282 ffff880077e9bd58 ffffffff819dd043 0000000000000003
[  106.425375] Call Trace:
[  106.428136]  [<ffffffff8110a801>] ? multi_cpu_stop+0x1/0x110
[  106.434475]  [<ffffffff81030f62>] native_cpu_disable+0x12/0x40
[  106.441018]  [<ffffffff819dd043>] take_cpu_down+0x13/0x40
[  106.447074]  [<ffffffff8110a8c1>] multi_cpu_stop+0xc1/0x110
[  106.453324]  [<ffffffff8110a800>] ? cpu_stop_should_run+0x50/0x50
[  106.460156]  [<ffffffff8110aad8>] cpu_stopper_thread+0x78/0x150
[  106.466795]  [<ffffffff819f2bde>] ? _raw_spin_unlock_irq+0x1e/0x40
[  106.473726]  [<ffffffff810b33d7>] ? finish_task_switch+0x57/0xd0
[  106.480464]  [<ffffffff819edffe>] ? __schedule+0x37e/0x7b0
[  106.486619]  [<ffffffff810b20fd>] smpboot_thread_fn+0x17d/0x2b0
[  106.493259]  [<ffffffff810b1f80>] ? SyS_setgroups+0x160/0x160
[  106.499704]  [<ffffffff810ab1a4>] kthread+0xe4/0x100

We find latest upstream has commit d97eb8966c91f2c9d05f0a22eb89ed5b76d966d1 to
solve this IPANIC. But from the link http://lkml.kernel.org/r/20150204132754.GA10078@suse.de,
it looks the root cause is not clear.

As it's easily to hit with the specific test case, we have more check and find
the IPANIC scenario as below.

      cpu N (N = 1, or 2, or 3)		    	    cpu 0
	native_cpu_up				device_shutdown
	=> do_boot_cpu
	=> start_secondary
		=> smp_callin
		 => setup_vector_irq
		 => __setup_vector_irq
							=> free_msi_irqs
							=> arch_teardown_msi_irqs
							=> default_teardown_msi_irqs
							=> arch_teardown_msi_irq
							=> native_teardown_msi_irq
							=> destroy_irq
							=> __clear_irq_vector
		=> set_cpu_online

The cpu still is not online when clear irq vector, it makes the irq number remain
in irq vector after free_msi_irqs. Next native_cpu_disable() will hit NULL pointer
when check irq vector.

The patch move setup_vector_irq after set_cpu_online.

Signed-off-by: xiao jin <jin.xiao@...el.com>
---
 arch/x86/kernel/smpboot.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 50e547e..f7d5d79 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -172,11 +172,6 @@ static void smp_callin(void)
 	apic_ap_setup();
 
 	/*
-	 * Need to setup vector mappings before we enable interrupts.
-	 */
-	setup_vector_irq(smp_processor_id());
-
-	/*
 	 * Save our processor parameters. Note: this information
 	 * is needed for clock calibration.
 	 */
@@ -257,6 +252,11 @@ static void notrace start_secondary(void *unused)
 	cpu_set_state_online(smp_processor_id());
 	x86_platform.nmi_init();
 
+	/*
+	 * Need to setup vector mappings before we enable interrupts.
+	 */
+	setup_vector_irq(smp_processor_id());
+
 	/* enable local interrupts */
 	local_irq_enable();
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ