lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 30 Jan 2009 08:40:51 -0800
From:	Jeremy Fitzhardinge <jeremy@...p.org>
To:	Ingo Molnar <mingo@...e.hu>
CC:	paulmck@...ux.vnet.ibm.com,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Tejun Heo <tj@...nel.org>, Yinghai Lu <yinghai@...nel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Brian Gerst <brgerst@...il.com>
Subject: Re: Seeing "huh, entered softirq 8 ffffffff802682aa preempt_count
 00000100, exited with 00010100?" in tip.git

Ingo Molnar wrote:
> * Jeremy Fitzhardinge <jeremy@...p.org> wrote:
>
>   
>> Paul E. McKenney wrote:
>>     
>>> On Thu, Jan 29, 2009 at 02:12:31PM -0800, Jeremy Fitzhardinge wrote:
>>>   
>>>       
>>>> When I boot my x86-64 tip.git kernel under Xen, I'm seeing:
>>>>
>>>> pnp: PnP ACPI: disabled
>>>> NET: Registered protocol family 2
>>>> huh, entered softirq 8 ffffffff802682aa preempt_count 00000100, 
>>>> exited with 00010100?
>>>> IP route cache hash table entries: 512 (order: 0, 4096 bytes)
>>>> TCP established hash table entries: 1024 (order: 2, 16384 bytes)
>>>> [...]
>>>> XENBUS: Device with no driver: device/console/0
>>>> Freeing unused kernel memory: 372k freed
>>>> Red Hat nash version 6.0.19 starting
>>>> Mounting proc filesystem
>>>> Mounting sysfs filesystem
>>>> Creating /dev
>>>> BUG: scheduling while atomic: swapper/0/0x00010000
>>>> Modules linked in:
>>>> Pid: 0, comm: swapper Not tainted 2.6.29-rc3-tip #481
>>>> Call Trace:
>>>> [<ffffffff80238c1f>] __schedule_bug+0x62/0x66
>>>> [<ffffffff80211d2d>] ? retint_restore_args+0x5/0x20
>>>> [<ffffffff80503921>] __schedule+0x95/0x792
>>>> [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
>>>> [<ffffffff802093aa>] ? _stext+0x3aa/0x1000
>>>> [<ffffffff805040c2>] schedule+0xe/0x22
>>>> [<ffffffff8020ff04>] cpu_idle+0x70/0x72
>>>> [<ffffffff804fc3a0>] cpu_bringup_and_idle+0x13/0x15
>>>> Creating initial device nodes
>>>> Setting up hotplug.
>>>>
>>>>
>>>> From what I can see, softirq 8 is the RCU softirq.  I don't know if 
>>>> the "scheduling while atomic" is related or not, but its two new 
>>>> schedulerish symptoms appearing at once, so I think its likely 
>>>> they're related.
>>>>     
>>>>         
>>> Hmmm...  Mysterious, as you seem to be using classic RCU, which hasn't
>>> changed in awhile.  Which branch of the tip tree are you using?
>>>   
>>>       
>> tip/master.  It looks like this appeared since -rc1.  Mu current  
>> suspicion is the percpu changes, since I'm seeing some other strange  
>> symptoms.
>>     
>
> Cc:-ed more folks - it's either the percpu changes or the APIC changes 
> (both occured at about the same time). Or maybe something from upstream.
>   

In my case I'm only seeing it when running under Xen, so I don't think 
this particular problem is apic related.  My current theory - dreamed up 
last night and completely untested/uninvestigated - is that interrupt 
masking isn't working.  In Xen, interrupts are masked via a per-cpu 
piece of memory shared with the hypervisor, so if the percpu setup is 
wrong in some way, the kernel and Xen could be using different memory 
for the mask, with hilarious results.

    J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ