linux-kernel - Re: [OOPS] [XEN] OOPS early after boot on master

Message-ID: <3e8340490906111555r313ff6e9nce956fc76eff7382@mail.gmail.com>
Date:	Thu, 11 Jun 2009 18:55:34 -0400
From:	Bryan Donlan <bdonlan@...il.com>
To:	Jeremy Fitzhardinge <jeremy@...p.org>,
	LKML <linux-kernel@...r.kernel.org>,
	xen-devel@...ts.xensource.com
Subject: Re: [OOPS] [XEN] OOPS early after boot on master

On Thu, Jun 11, 2009 at 5:16 PM, Jeremy Fitzhardinge<jeremy@...p.org> wrote:
> On 06/08/09 13:05, Bryan Donlan wrote:
>>
>> On Sun, Jun 7, 2009 at 1:10 PM, Bryan Donlan<bdonlan@...il.com>  wrote:
>>
>>>
>>> Shortly after boot, I got this OOPS:
>>>
>>> ------------[ cut here ]------------
>>> kernel BUG at kernel/sched.c:1209!
>>> invalid opcode: 0000 [#1] SMP
>>> last sysfs file: /sys/block/md0/dev
>>> Modules linked in:
>>>
>>> Pid: 1312, comm: khelper Not tainted (2.6.30-rc8 #1)
>>> EIP: 0061:[<c011e3a9>] EFLAGS: 00010046 CPU: 3
>>> EIP is at resched_task+0x69/0x70
>>> EAX: 00000000 EBX: c05c5660 ECX: 00000000 EDX: 00000002
>>> ESI: d60bb810 EDI: d7026600 EBP: 00000001 ESP: d5b1dee0
>>>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
>>> Process khelper (pid: 1312, ti=d5b1c000 task=d6138420 task.ti=d5b1c000)
>>> Stack:
>>>  c05c5660 d605f810 c0125aa0 00000000 00000000 00000000 00000000 d6075eb8
>>>  d6075ef4 00000001 00000001 c011f7c3 00000000 00000003 d6075f00 d6075ef8
>>>  d6075efc 00000200 00000000 c011fea0 00000000 00000000 d6138420 d6075ef8
>>> Call Trace:
>>>  [<c0125aa0>] ? try_to_wake_up+0xa0/0x1d0
>>>  [<c011f7c3>] ? __wake_up_common+0x43/0x70
>>>  [<c011fea0>] ? complete+0x40/0x60
>>>  [<c0128c10>] ? mm_release+0x40/0xc0
>>>  [<c01051de>] ? __raw_callee_save_xen_restore_fl+0x6/0x8
>>>  [<c05c1a2e>] ? _spin_unlock_irqrestore+0x1e/0x30
>>>  [<c012c3e6>] ? exit_mm+0x16/0x110
>>>  [<c01051ee>] ? __raw_callee_save_xen_irq_enable+0x6/0x8
>>>  [<c012df2e>] ? do_exit+0xfe/0x6d0
>>>  [<c05c007b>] ? schedule_timeout+0x10b/0x150
>>>  [<c010bacc>] ? kernel_execve+0x1c/0x30
>>>  [<c013b550>] ? ____call_usermodehelper+0x0/0x130
>>>  [<c013b67b>] ? ____call_usermodehelper+0x12b/0x130
>>>  [<c013b550>] ? ____call_usermodehelper+0x0/0x130
>>>  [<c01087d7>] ? kernel_thread_helper+0x7/0x10
>>> Code: c2 74 0e 0f ae f0 89 f6 8b 46 04 f6 40 0c 04 74 09 5b 5e c3 8d
>>> b6 00 00 00 00 89 d0 ff 15 f0 2e 6f c0 5b 5e 8d b6 00 00 00 00 c3 <0f>
>>> 0b eb fe 8d 76 00 53 89 c3 8b 0c 85 a0 b6 73 c0 ba 00 76 7a
>>> EIP: [<c011e3a9>] resched_task+0x69/0x70 SS:ESP 0069:d5b1dee0
>>> ---[ end trace 155a42330fa44f01 ]---
>>> Fixing recursive fault but reboot is needed!
>>>
>>> This occurs under i386, with commit 81ee1ba; x86_64 does not (seem to)
>>> have this issue. I'll try to bisect this shortly.
>>>
>>
>> Still working on the actual bisection, but the OOPS only occurs with
>> CONFIG_PARAVIRT_SPINLOCKS enabled.
>>
>
> Thanks for the report.  I haven't had a chance to look at it in detail, but
> it's interesting that it appears to be pv spinlocks...

On further analysis, that looks like a red herring: disabling PV
spinlocks just makes the OOPS occur less often, I think. I'm still
bisecting it; the bisection is complicated by other OOPS-causing bugs
that existed in the interim, but this one definitely existed before
the introduction of CONFIG_PARAVIRT_SPINLOCKS.
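
For bisecting through interim breakage like this, one possible approach
is a small classifier for captured boot logs whose result is used as the
exit status of a `git bisect run` script. The sketch below is only an
illustration, not the procedure actually used here: the function name
`classify` and the regular expressions are assumptions, and the only
detail taken from the report is the `resched_task` signature. It relies
on the documented `git bisect run` convention that exit status 0 means
good, 125 means "skip this commit", and other statuses from 1 to 127
mean bad.

```python
import re
import sys

# Exit statuses understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125

# Signature of the oops being bisected, taken from the report above.
TARGET = re.compile(r"EIP is at resched_task\+")

# Generic oops markers (an assumed, non-exhaustive list). A log that matches
# one of these without matching TARGET is treated as an unrelated failure.
ANY_OOPS = re.compile(r"kernel BUG at|Oops:|invalid opcode:")


def classify(log_text):
    """Map a captured boot log to a `git bisect run` exit status."""
    if TARGET.search(log_text):
        return BAD   # the oops we are hunting
    if ANY_OOPS.search(log_text):
        return SKIP  # some other oops: this commit cannot be judged
    return GOOD      # booted without any oops marker


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python classify_boot.py console.log
    with open(sys.argv[1]) as f:
        sys.exit(classify(f.read()))
```

Since the OOPS appears to be intermittent, a single clean boot does not
prove a commit good; a wrapper script would probably need to boot each
commit several times before reporting GOOD.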