Date:	Fri, 09 Mar 2012 16:52:42 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	Jeff Moyer <jmoyer@...hat.com>
CC:	Sasha Levin <levinsasha928@...il.com>,
	Nick Bowler <nbowler@...iptictech.com>,
	linux-kernel@...r.kernel.org
Subject: Re: the maxcpus= boot parameter broke somewhere along the line

On 03/08/2012 12:44 AM, Jeff Moyer wrote:

> "Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com> writes:
> 
>> On 03/06/2012 11:38 PM, Jeff Moyer wrote:
>>
>>> Sasha Levin <levinsasha928@...il.com> writes:
>>>
>>>> I can't reproduce it locally with a 3.3-rc5 kernel.
>>>
>>> First, thanks for looking into it.  I just did a git pull, up to -rc6,
>>> and the problem still persists on my machine.
>>>
>>
>>
>> I tried 3.3-rc4 as well as 3.3-rc6+ (last commit dac12d1). I did not
>> see the problem in either case.
> 
> I bisected the issue, and it landed here:
> 
> 8a25a2fd126c621f44f3aeaef80d51f00fc11639 is the first bad commit
> commit 8a25a2fd126c621f44f3aeaef80d51f00fc11639
> Author: Kay Sievers <kay.sievers@...y.org>
> Date:   Wed Dec 21 14:29:42 2011 -0800
> 
>     cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular
>     subsystem
> 
> Unfortunately, that's a HUGE commit.
> 


This was from your dmesg:

sd 0:0:10:1: [sdk] Attached SCSI disk
readahead: starting
udev: starting version 147
SMP alternatives: switching to SMP code
WARNING! power/level is deprecated; use power/control instead
EDAC MC: Ver: 2.1.0
Booting Node 0 Processor 3 APIC 0x3
smpboot cpu 3: start_ip = 9a000
EDAC MC0: Giving out device to 'i3200_edac' 'i3200': DEV 0000:00:00.0
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node 0 Processor 2 APIC 0x1
smpboot cpu 2: start_ip = 9a000
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node 0 Processor 1 APIC 0x2
smpboot cpu 1: start_ip = 9a000
NMI watchdog enabled, takes one hw-pmu counter.


Looking at the mention of udev above, and considering the commit you bisected
to, I think it would be good to check whether someone is writing 1 to
/sys/devices/system/cpu/cpu*/online, so that the cpus are getting hot-added
towards the end of boot. Maybe that sounds stupid, but worth a try :)
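
A quick cross-check from userspace, before applying any patch, is to look at
/sys/devices/system/cpu/online once boot has finished: with maxcpus=1 it
would be expected to list a single cpu. Below is a minimal sketch of that
check. The helper names count_cpus_in_list() and count_cpus_in_file() are
made up for illustration and are not kernel or libc APIs; the only thing
assumed about the system is the usual cpulist format of that file (for
example "0" or "0-3").

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Count the cpus named in a cpulist string such as "0-3,5".
 * Hypothetical helper for illustration; returns -1 on malformed input.
 */
int count_cpus_in_list(const char *list)
{
	const char *p = list;
	int count = 0;

	while (*p && *p != '\n') {
		char *end;
		long lo, hi;

		lo = strtol(p, &end, 10);
		if (end == p || lo < 0)
			return -1;
		hi = lo;
		p = end;

		/* Optional "-N" suffix turns a single cpu into a range. */
		if (*p == '-') {
			p++;
			hi = strtol(p, &end, 10);
			if (end == p || hi < lo)
				return -1;
			p = end;
		}
		count += (int)(hi - lo + 1);

		if (*p == ',')
			p++;
		else if (*p && *p != '\n')
			return -1;
	}
	return count;
}

/*
 * Read one cpulist line from a file (e.g. /sys/devices/system/cpu/online)
 * and count the cpus in it. Returns -1 if the file cannot be read.
 */
int count_cpus_in_file(const char *path)
{
	char buf[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return count_cpus_in_list(buf);
}
```

Calling count_cpus_in_file("/sys/devices/system/cpu/online") from a small
main() right after boot and getting more than 1 with maxcpus=1 would show
that the extra cpus were brought up later, though not by whom.
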

So can you try the debug patch below? It applies on latest linux-3.3-rc6+

---

 drivers/base/cpu.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)


diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 4dabf50..49d5f83 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -43,11 +43,13 @@ static ssize_t __ref store_online(struct device *dev,
 	cpu_hotplug_driver_lock();
 	switch (buf[0]) {
 	case '0':
+		printk("CPU %d offline initiated from userspace\n", cpu->dev.id);
 		ret = cpu_down(cpu->dev.id);
 		if (!ret)
 			kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
 		break;
 	case '1':
+		printk("CPU %d online initiated from userspace\n", cpu->dev.id);
 		ret = cpu_up(cpu->dev.id);
 		if (!ret)
 			kobject_uevent(&dev->kobj, KOBJ_ONLINE);

