Date:	Tue, 05 Jan 2010 18:18:16 +0800
From:	Xiaotian Feng <dfeng@...hat.com>
To:	Marc Dionne <marc.c.dionne@...il.com>
CC:	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: BUG during shutdown - bisected to commit e2912009

On 01/05/2010 11:23 AM, Marc Dionne wrote:
> On Mon, Jan 4, 2010 at 9:56 PM, Xiaotian Feng<dfeng@...hat.com>  wrote:
>> On 01/05/2010 02:43 AM, Marc Dionne wrote:
>>>
>>> On Fri, Jan 1, 2010 at 7:42 PM, Peter Zijlstra<peterz@...radead.org>
>>>   wrote:
>>>>
>>>> On Fri, 2010-01-01 at 19:27 -0500, Marc Dionne wrote:
>>>>>
>>>>> I'm getting a BUG with current kernels from
>>>>> kernel/time/clockevents.c:263 when halting the system - a restart
>>>>> behaves normally.  I don't have a good camera handy at the moment to
>>>>> capture the call stack on screen, but the call sequence is:
>>>>>
>>>>> clockevents_notify
>>>>> hrtimer_cpu_notify
>>>>> notifier_call_chain
>>>>> raw_notifier_call_chain
>>>>> _cpu_down
>>>>> disable_nonboot_cpus
>>>>> kernel_power_off
>>>>> sys_reboot
>>>>>
>>>>> I bisected it down to commit e2912009: sched: Ensure set_task_cpu() is
>>>>> never called on blocked tasks.  There were a few commits tested along
>>>>> the way where I got a freeze (with the power still on) instead of a
>>>>> BUG. Reverting that commit from the current kernel doesn't look
>>>>> trivial, but the commit immediately preceding this one does halt fine.
>>>>
>>>> We somehow seem to trip up the below patch, which doesn't really make
>>>> sense, as I can't find how task placement would affect the below error.
>>>>
>>>> It seems to purely test against the hot-unplugged cpu, not a cpu the
>>>> task is running on.
>>>>
>>>> ---
>>>> commit bb6eddf7676e1c1f3e637aa93c5224488d99036f
>>>> Author: Thomas Gleixner<tglx@...utronix.de>
>>>> Date:   Thu Dec 10 15:35:10 2009 +0100
>>>
>>> Probably predictable but worth testing: reverting that patch does
>>> allow my system to shut down cleanly.
>>
>> Reverting that patch removes the BUG_ON, which is why you can shut down
>> cleanly.
>>
>> Could you please attach your kernel config file? I'm a little confused about
>> how you reverted e2912009. Was it done manually? I can't see any connection
>> between e2912009 and bb6eddf7. Could you please also show me your timer list
>> (cat /proc/timer_list)?
>
> config is attached, and the output of cat /proc/timer_list is also
> attached (it's rather large).
>
> To recap:
> - Reverting bb6eddf7 gives me a clean shutdown - predictable of course
> since it removes the BUG_ON
> - I wasn't able to trivially revert e2912009 from a current kernel.
> But it fails to shut down, while the preceding commit is OK.
>
> So it would seem that e2912009 is triggering something that the check
> in bb6eddf7 is catching.
>
> With more recent kernels (but not the ones around e2912009), I do get
> these timer-related warnings in dmesg (and briefly on screen) :
>
> PCSP: Timer resolution is not sufficient (999848nS)
> PCSP: Make sure you have HPET and ACPI enabled.
> PCSP: Turned into nopcm mode.
>
That message is printed by the sound module, and it will not affect
clockevents. Could you please try the following patch and let me know what
it prints before the BUG_ON fires? That will give us more information about
the BUG_ON. Thank you.

diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 6f740d9..7c945e8 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -260,6 +260,9 @@ void clockevents_notify(unsigned long reason, void *arg)
                 list_for_each_entry_safe(dev, tmp, &clockevent_devices, list) {
                         if (cpumask_test_cpu(cpu, dev->cpumask) &&
                             cpumask_weight(dev->cpumask) == 1) {
+                               if (dev->mode != CLOCK_EVT_MODE_UNUSED)
+                                       printk("invalid dev %s mode %d on cpu %d\n", dev->name,
+                                               dev->mode, cpu);
                                 BUG_ON(dev->mode != CLOCK_EVT_MODE_UNUSED);
                                 list_del(&dev->list);

> Marc
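
For readers who want to see what the debug patch above reports, here is a
minimal userspace sketch of the same check. It is only a model, not kernel
code: struct model_dev, mask_weight() and count_offending() are hypothetical
names made up for illustration, the cpumask is reduced to a plain bitmask,
and the enum values are assumed to follow the kernel's ordering at the time.
A device is reported only when it serves the departing CPU alone and is not
in CLOCK_EVT_MODE_UNUSED, which is the condition the BUG_ON tests.

```c
#include <stdio.h>

/* Assumed ordering of the clock event modes; only UNUSED matters here. */
enum clock_event_mode {
	CLOCK_EVT_MODE_UNUSED = 0,
	CLOCK_EVT_MODE_SHUTDOWN,
	CLOCK_EVT_MODE_PERIODIC,
	CLOCK_EVT_MODE_ONESHOT,
};

/* Hypothetical stand-in for struct clock_event_device. */
struct model_dev {
	const char *name;
	unsigned long cpumask;		/* bit n set: device serves CPU n */
	enum clock_event_mode mode;
};

/* Count the set bits in the mask (models cpumask_weight()). */
static int mask_weight(unsigned long mask)
{
	int n = 0;

	while (mask) {
		n += (int)(mask & 1UL);
		mask >>= 1;
	}
	return n;
}

/*
 * Walk the devices as the CPU_DEAD path does and report every device that
 * is bound solely to 'cpu' but is not in the UNUSED mode. Returns how many
 * devices would have tripped the BUG_ON.
 */
static int count_offending(const struct model_dev *devs, int ndevs, int cpu)
{
	int bad = 0;

	for (int i = 0; i < ndevs; i++) {
		const struct model_dev *dev = &devs[i];

		if ((dev->cpumask & (1UL << cpu)) &&
		    mask_weight(dev->cpumask) == 1 &&
		    dev->mode != CLOCK_EVT_MODE_UNUSED) {
			printf("invalid dev %s mode %d on cpu %d\n",
			       dev->name, (int)dev->mode, cpu);
			bad++;
		}
	}
	return bad;
}
```

In this model, a device whose mask covers more than one CPU is skipped
regardless of its mode, so only per-CPU devices left in an active mode are
reported when their CPU goes away.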
