Message-ID: <1332504120.16159.17.camel@twins>
Date:	Fri, 23 Mar 2012 13:02:00 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	"Liu, Chuansheng" <chuansheng.liu@...el.com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Yanmin Zhang <yanmin_zhang@...ux.intel.com>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
Subject: RE: [PATCH] Fix the race between smp_call_function and CPU booting

On Fri, 2012-03-23 at 11:32 +0000, Liu, Chuansheng wrote:
> In fact, I started two scripts running:
> 1/ One script:
> echo 0 > /sys/devices/system/cpuX/online
> echo 1 > /sys/devices/system/cpuX/online
> Rerunning the above commands in loop
> 
> 2/Another script:
> echo 1 > /debug/smp_call_test
> usleep 50000
> Rerunning the above command in loop
> 
> This race issue can be easy to be reproduced in several minutes;
> For simplify your test as mine(just two CPUs), you can set other non-booting CPUs as offline
> at first and just leave one non-booting CPU.

So this is exactly what I did and it ran for 30+ minutes without fail. I
found I forgot to log the serial output so I just re-ran this to make
sure. 10+ minutes and not a single WARN in the console output.
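
For anyone wanting to repeat this, the two loops from the quoted report can be sketched as a pair of shell functions. This is only a rough sketch, not the reporter's actual scripts: the function names, the iteration-count argument and the use of `sleep 0.05` in place of `usleep 50000` are assumptions made for illustration. The paths are passed in as arguments so the loops can be tried against ordinary files before pointing them at real control files.

```shell
#!/bin/sh
# Sketch of the two reproduction loops described in the quoted report.
# hotplug_loop and trigger_loop are hypothetical names; the target files
# are passed as arguments rather than hard-coded.

# Take a CPU offline and bring it back online COUNT times by writing
# 0 and then 1 to its "online" control file.
hotplug_loop() {
    online_file=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo 0 > "$online_file"
        echo 1 > "$online_file"
        i=$((i + 1))
    done
}

# Write 1 to the trigger file COUNT times, pausing about 50 ms after
# each write. Fractional sleep is not POSIX, but GNU coreutils and
# BusyBox accept it; the report used "usleep 50000" instead.
trigger_loop() {
    trigger_file=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo 1 > "$trigger_file"
        sleep 0.05
        i=$((i + 1))
    done
}
```

Run as root, the first function would be given a CPU's sysfs control file (on current kernels that is `/sys/devices/system/cpu/cpuN/online`) and started in the background, while the second would be given `/debug/smp_call_test`, which appears to be a debug file from the reporter's own test setup rather than anything in mainline.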

If I pop my change to select_fallback_rq() I can indeed trigger this:

------------[ cut here ]------------
WARNING: at /usr/src/linux-2.6/arch/x86/kernel/smp.c:120 native_smp_send_reschedule+0x5b/0x60()
Hardware name: X8DTN
Modules linked in: [last unloaded: scsi_wait_scan]
Pid: 1542, comm: abrtd Not tainted 3.3.0-01725-gd6eb054-dirty #63
Call Trace:
 <IRQ>  [<ffffffff810775df>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff8107763a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff8105f79b>] native_smp_send_reschedule+0x5b/0x60
 [<ffffffff810aa67a>] try_to_wake_up+0x1fa/0x2c0
 [<ffffffff810acaec>] ? sched_slice.isra.38+0x5c/0x90
 [<ffffffff810aa795>] wake_up_process+0x15/0x20
 [<ffffffff81085c6e>] process_timeout+0xe/0x10
 [<ffffffff81086cb3>] run_timer_softirq+0x143/0x460
 [<ffffffff81384a94>] ? timerqueue_add+0x74/0xc0
 [<ffffffff81085c60>] ? usleep_range+0x50/0x50
 [<ffffffff8107e81d>] __do_softirq+0xbd/0x290
 [<ffffffff810c5e64>] ? clockevents_program_event+0x74/0x100
 [<ffffffff810c72d4>] ? tick_program_event+0x24/0x30
 [<ffffffff8194ba4c>] call_softirq+0x1c/0x30
 [<ffffffff810433d5>] do_softirq+0x55/0x90
 [<ffffffff8107ed2e>] irq_exit+0x9e/0xe0
 [<ffffffff8194c07e>] smp_apic_timer_interrupt+0x6e/0x99
 [<ffffffff8194b107>] apic_timer_interrupt+0x67/0x70
 <EOI> 
---[ end trace d2b2cbf78c1ddd2e ]---


But let me re-run with the select_fallback_rq() change and let it run
for several hours while I go play outside.. 