Message-ID: <20120412024651.GA7250@localhost>
Date:	Thu, 12 Apr 2012 10:46:51 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Suresh Siddha <suresh.b.siddha@...el.com>
Cc:	Alex Shi <alex.shi@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>, john stultz <johnstul@...ibm.com>,
	venki@...gle.com, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: kernel panic on NHM EX machine

Hi all,

Auto bisection points to the commit below, and reverting it on top of
v3.4-rc2 confirms the result.  Attached is my kconfig.

commit 77b0d60c5adf39c74039e2142a1d3cd1e4d53799
Author: Suresh Siddha <suresh.b.siddha@...el.com>
Date:   Fri Nov 4 17:18:21 2011 -0700

    clockevents: Leave the broadcast device in shutdown mode when not needed
    
    Platforms with Always Running APIC Timer doesn't use the broadcast timer
    but the kernel is leaving the broadcast timer (HPET in this case)
    in oneshot mode.
    
    On these platforms, before the switch to oneshot mode, broadcast device is
    actually in shutdown mode. Code checks for empty tick_broadcast_mask and
    avoids going into the periodic mode.
    
    During switch to oneshot mode, add the same tick_broadcast_mask checks in the
    tick_broadcast_switch_to_oneshot() and avoid the broadcast device going into
    the oneshot mode.
    
    Signed-off-by: Suresh Siddha <suresh.b.siddha@...el.com>
    Cc: john stultz <johnstul@...ibm.com>
    Cc: venki@...gle.com
    Link: http://lkml.kernel.org/r/1320452301.15071.16.camel@sbsiddha-desk.sc.intel.com
    Signed-off-by: Thomas Gleixner <tglx@...utronix.de>

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index fd4a7b1..e883f57 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -575,11 +575,15 @@ void tick_broadcast_switch_to_oneshot(void)
        unsigned long flags;
 
        raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
+       if (cpumask_empty(tick_get_broadcast_mask()))
+               goto end;
 
        tick_broadcast_device.mode = TICKDEV_MODE_ONESHOT;
        bc = tick_broadcast_device.evtdev;
        if (bc)
                tick_broadcast_setup_oneshot(bc);
+
+end:
        raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
 }

Thanks,
Fengguang
---
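For readers following the thread, the guard added by the commit quoted
above can be modelled in user space roughly as follows. This is only a
sketch under stated assumptions: the enum, the struct and the function
names are hypothetical stand-ins rather than the kernel's definitions,
the CPU mask is reduced to a single unsigned long, and the
tick_broadcast_lock handling is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's tick device modes. */
enum tickdev_mode { MODE_SHUTDOWN, MODE_PERIODIC, MODE_ONESHOT };

/* Hypothetical model of the broadcast device state. */
struct broadcast_device {
    enum tickdev_mode mode;
    unsigned long cpu_mask; /* bit n set: CPU n depends on the broadcast timer */
};

/* Plays the role of cpumask_empty(tick_get_broadcast_mask()). */
bool broadcast_mask_empty(const struct broadcast_device *bc)
{
    return bc->cpu_mask == 0;
}

/*
 * Models the patched tick_broadcast_switch_to_oneshot(): when no CPU
 * depends on the broadcast timer, return early and leave the device in
 * whatever mode it was in (shutdown, per the commit message).
 */
void switch_to_oneshot(struct broadcast_device *bc)
{
    if (broadcast_mask_empty(bc))
        return;                 /* the "goto end" path in the patch */

    bc->mode = MODE_ONESHOT;
}

/* Exercises both paths of the guard. */
void demo_guard(void)
{
    struct broadcast_device unused = { MODE_SHUTDOWN, 0UL };
    struct broadcast_device needed = { MODE_SHUTDOWN, 0x5UL };

    switch_to_oneshot(&unused);
    switch_to_oneshot(&needed);

    assert(unused.mode == MODE_SHUTDOWN); /* empty mask: mode untouched */
    assert(needed.mode == MODE_ONESHOT);  /* CPUs 0 and 2 set: switched */
}
```

Calling demo_guard() from a test harness exercises both the early
return and the normal switch. The model says nothing about why the
early return leads to the panic reported in this thread; it only
shows which state the device is left in.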

On Tue, Apr 10, 2012 at 07:35:56AM -0700, Paul E. McKenney wrote:
> On Tue, Apr 10, 2012 at 09:04:52AM +0800, Alex Shi wrote:
> > On 04/10/2012 06:31 AM, Paul E. McKenney wrote:
> > 
> > > On Fri, Apr 06, 2012 at 07:37:13PM +0800, Alex Shi wrote:
> > >> The 3.4-rc1 kernel has a kernel panic in idle booting.
> > >>
> > >> Actually, since the 3.3-rc1 kernel we have occasionally hit this issue
> > >> when running busy hackbench tests, but with the rc1 kernel it happens
> > >> on every reboot.
> > > 
> > > Can't say I have seen anything like this in my own testing, though I
> > > did see significant instability in 3.4-rc1.  However, 3.4-rc2 works
> > > much better for me.  Could you please try it out?
> > > 
> > > 							Thanx, Paul
> > > 
> > 
> > 
> > Oops, saw it again when booting the rc2 kernel.
> 
> Hey, I was hoping!
> 
> There have not been any changes to __rcu_pending() itself between
> v3.3-rc1 and v3.4-rc2, so I must confess to be a bit puzzled at
> the difference in reliability.  Would you have any debug symbols
> with which to map the panic back to the source code?  Perhaps gcc
> is aggressively inlining.
> 
> Also, what kind of panic was this?  NULL pointer?  Illegal instruction?
> Something else?
> 
> Given that this now happens on boot, could you please bisect it?
> 
> 						Thanx, Paul
> 
> >  <IRQ>  [<ffffffff810a04bc>] __rcu_pending+0xbd/0x3bf
> >  [<ffffffff810a0a8a>] rcu_check_callbacks+0x69/0xa7
> >  [<ffffffff81045ffb>] update_process_times+0x3a/0x71
> >  [<ffffffff81078e63>] tick_sched_timer+0x6b/0x95
> >  [<ffffffff81056874>] __run_hrtimer+0xb8/0x141
> >  [<ffffffff81078df8>] ? tick_nohz_handler+0xd3/0xd3
> >  [<ffffffff81056f09>] hrtimer_interrupt+0xdb/0x199
> >  [<ffffffff810781aa>] tick_do_broadcast.constprop.3+0x44/0x88
> >  [<ffffffff81078320>] tick_do_periodic_broadcast+0x34/0x3e
> >  [<ffffffff81078339>] tick_handle_periodic_broadcast+0xf/0x40
> >  [<ffffffff810101b4>] timer_interrupt+0x10/0x17
> >  [<ffffffff8109b0c6>] handle_irq_event_percpu+0x5a/0x199
> >  [<ffffffff8109b23c>] handle_irq_event+0x37/0x53
> >  [<ffffffff81028785>] ? ack_apic_edge+0x1f/0x23
> >  [<ffffffff8109d937>] handle_edge_irq+0xa1/0xc8
> >  [<ffffffff8100fb5e>] handle_irq+0x125/0x12e
> >  [<ffffffff8103f9c8>] ? irq_enter+0x13/0x64
> >  [<ffffffff8100f76e>] do_IRQ+0x48/0xa0
> >  [<ffffffff8145b8aa>] common_interrupt+0x6a/0x6a
> >  [<ffffffff81078320>] ? tick_do_periodic_broadcast+0x34/0x3e
> >  [<ffffffff8103ecdb>] ? arch_local_irq_enable+0x8/0xd
> >  [<ffffffff8103f781>] __do_softirq+0x5e/0x182
> >  [<ffffffff81078ed2>] ? update_ts_time_stats+0x2c/0x62
> >  [<ffffffff8106258c>] ? sched_clock_idle_wakeup_event+0x12/0x16
> >  [<ffffffff81462d5c>] call_softirq+0x1c/0x30
> >  [<ffffffff8100fba8>] do_softirq+0x41/0x7d
> >  [<ffffffff8103fa5d>] irq_exit+0x44/0x9c
> >  [<ffffffff81060202>] scheduler_ipi+0x6b/0x6d
> >  [<ffffffff81025dba>] smp_reschedule_interrupt+0x16/0x18
> >  [<ffffffff8146290a>] reschedule_interrupt+0x6a/0x70
> >  <EOI>  [<ffffffff812878ff>] ? arch_local_irq_enable+0x8/0xd
> >  [<ffffffff8106258c>] ? sched_clock_idle_wakeup_event+0x12/0x16
> >  [<ffffffff81288557>] acpi_idle_enter_bm+0x222/0x266
> >  [<ffffffff8138b98b>] cpuidle_enter+0x12/0x14
> >  [<ffffffff8138be61>] cpuidle_idle_call+0xef/0x191
> >  [<ffffffff81015501>] cpu_idle+0x9e/0xe8
> >  [<ffffffff81439c99>] rest_init+0x6d/0x6f
> >  [<ffffffff81ad3b7b>] start_kernel+0x3ad/0x3ba
> >  [<ffffffff81ad34ff>] ? loglevel+0x31/0x31
> >  [<ffffffff81ad32c3>] x86_64_start_reservations+0xae/0xb2
> >  [<ffffffff81ad3140>] ? early_idt_handlers+0x140/0x140
> >  [<ffffffff81ad33c9>] x86_64_start_kernel+0x102/0x111
> > 
> > 
> 

View attachment ".config" of type "text/plain" (86152 bytes)
