Message-ID: <4B4311E8.1000001@redhat.com>
Date: Tue, 05 Jan 2010 18:18:16 +0800
From: Xiaotian Feng <dfeng@...hat.com>
To: Marc Dionne <marc.c.dionne@...il.com>
CC: Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: BUG during shutdown - bisected to commit e2912009
On 01/05/2010 11:23 AM, Marc Dionne wrote:
> On Mon, Jan 4, 2010 at 9:56 PM, Xiaotian Feng<dfeng@...hat.com> wrote:
>> On 01/05/2010 02:43 AM, Marc Dionne wrote:
>>>
>>> On Fri, Jan 1, 2010 at 7:42 PM, Peter Zijlstra<peterz@...radead.org>
>>> wrote:
>>>>
>>>> On Fri, 2010-01-01 at 19:27 -0500, Marc Dionne wrote:
>>>>>
>>>>> I'm getting a BUG with current kernels from
>>>>> kernel/time/clockevents.c:263 when halting the system - a restart
>>>>> behaves normally. I don't have a good camera handy at the moment to
>>>>> capture the call stack on screen, but the call sequence is:
>>>>>
>>>>> clockevents_notify
>>>>> hrtimer_cpu_notify
>>>>> notifier_call_chain
>>>>> raw_notifier_call_chain
>>>>> _cpu_down
>>>>> disable_nonboot_cpus
>>>>> kernel_power_off
>>>>> sys_reboot
>>>>>
>>>>> I bisected it down to commit e2912009: sched: Ensure set_task_cpu() is
>>>>> never called on blocked tasks. There were a few commits tested along
>>>>> the way where I got a freeze (with the power still on) instead of a
>>>>> BUG. Reverting that commit from the current kernel doesn't look
>>>>> trivial, but the commit immediately preceding this one does halt fine.
>>>>
>>>> We somehow seem to trip up the below patch, which doesn't really make
>>>> sense, as I can't find how task placement would affect the below error.
>>>>
>>>> It seems to purely test against the hot-unplugged cpu, not a cpu the
>>>> task is running on.
>>>>
>>>> ---
>>>> commit bb6eddf7676e1c1f3e637aa93c5224488d99036f
>>>> Author: Thomas Gleixner<tglx@...utronix.de>
>>>> Date: Thu Dec 10 15:35:10 2009 +0100
>>>
>>> Probably predictable but worth testing, reverting that patch does
>>> allow my system to shutdown cleanly.
>>
>> That BUG_ON was removed by reverting that patch, so you can shut down
>> cleanly.
>>
>> Could you please attach your kernel config file? I'm a little confused about
>> how you reverted e2912009. Did you do it manually? I can't see any connection
>> between e2912009 and bb6eddf7. Could you please also show me your timer list
>> (cat /proc/timer_list)?
>
> config is attached, and the output of cat /proc/timer_list is also
> attached (it's rather large).
>
> To recap:
> - Reverting bb6eddf7 gives me a clean shutdown - predictable of course
> since it removes the BUG_ON
> - I wasn't able to trivially revert e2912009 from a current kernel.
> But it fails to shut down while the preceding commit is OK.
>
> So it would seem that e2912009 is triggering something that the check
> in bb6eddf7 is catching.
>
> With more recent kernels (but not the ones around e2912009), I do get
> these timer-related warnings in dmesg (and briefly on screen) :
>
> PCSP: Timer resolution is not sufficient (999848nS)
> PCSP: Make sure you have HPET and ACPI enabled.
> PCSP: Turned into nopcm mode.
>
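These messages are printed by the sound module (PCSP) and should not
affect clockevents. Could you please try the following patch and send me
the message it prints just before the BUG_ON triggers? It reports the
name and mode of the offending clock event device and the CPU being
taken down, which gives us more information about the BUG_ON. Thank you.
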
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 6f740d9..7c945e8 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -260,6 +260,9 @@ void clockevents_notify(unsigned long reason, void *arg)
 		list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
 			if (cpumask_test_cpu(cpu, dev->cpumask) &&
 			    cpumask_weight(dev->cpumask) == 1) {
+				if (dev->mode != CLOCK_EVT_MODE_UNUSED)
+					printk("invalid dev %s mode %d on cpu %d\n",
+					       dev->name, dev->mode, cpu);
 				BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
 				list_del(&dev->list);
> Marc
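
For anyone following the thread without a kernel tree at hand, here is a
rough userspace sketch of what the check in the CPU-dead path does. It is
only an illustration under stated assumptions: struct fake_dev, the
MODE_* values, mask_weight() and check_dead_cpu() are hypothetical
stand-ins and not kernel APIs, and the cpumask is modelled as a plain
bitmask. A device is examined only if it serves the dead CPU and no other
CPU, and it is reported if it is still in a mode other than "unused".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's clock event modes. */
enum fake_mode { MODE_UNUSED = 0, MODE_SHUTDOWN, MODE_PERIODIC, MODE_ONESHOT };

/* Hypothetical, simplified clock event device. */
struct fake_dev {
	const char *name;
	unsigned long cpumask;	/* bit n set => device serves CPU n */
	enum fake_mode mode;
};

/* Count the set bits in mask (plays the role of cpumask_weight()). */
static int mask_weight(unsigned long mask)
{
	int n = 0;

	while (mask) {
		n += (int)(mask & 1UL);
		mask >>= 1;
	}
	return n;
}

/*
 * Walk all devices and report those that belong exclusively to the
 * dead CPU but are not in the unused mode. Returns how many such
 * devices were found; the kernel would hit BUG_ON on the first one.
 */
static int check_dead_cpu(const struct fake_dev *devs, int ndevs, int cpu)
{
	int bad = 0;

	assert(cpu >= 0 && cpu < (int)(sizeof(unsigned long) * 8));

	for (int i = 0; i < ndevs; i++) {
		const struct fake_dev *dev = &devs[i];

		/* Skip devices for other CPUs and devices shared by several CPUs. */
		if (!(dev->cpumask & (1UL << cpu)) || mask_weight(dev->cpumask) != 1)
			continue;

		if (dev->mode != MODE_UNUSED) {
			printf("invalid dev %s mode %d on cpu %d\n",
			       dev->name, (int)dev->mode, cpu);
			bad++;
		}
	}
	return bad;
}
```

Calling check_dead_cpu() with an array that contains a per-CPU device
for the dead CPU still in oneshot mode returns a non-zero count and
prints one line per offending device. A device shared by several CPUs is
skipped because its mask weight is greater than one.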