Message-ID: <20131112095516.GB32441@localhost>
Date: Tue, 12 Nov 2013 17:55:16 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Michael wang <wangyun@...ux.vnet.ibm.com>,
Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org
Subject: Re: [sched/get_online_cpus] INFO: task swapper/0:1 blocked for more than 120 seconds.

On Mon, Nov 11, 2013 at 05:20:22PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 11, 2013 at 03:47:11PM +0800, Michael wang wrote:
> > Hi, Fengguang
> >
> > On 11/10/2013 06:16 PM, Fengguang Wu wrote:
> > > Greetings,
> > >
> > > I got the below dmesg and the first bad commit is
> >
> > I guess this will disappear when '!CONFIG_RCU_BOOST'...
> >
> > AFAIK, if the rsp was in boost mode, we count on smpboot-thread
> > 'rcu_cpu_thread_spec' to finish the callback, which will be
> > parked before do sync-rcu inside _cpu_down(), if that was true,
> > then the sync will never finish...
> >
> > May be some brainless fix like this?
> >
> >
> >
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 63aa50d..aa24338 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -306,7 +306,6 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
> >  				__func__, cpu);
> >  		goto out_release;
> >  	}
> > -	smpboot_park_threads(cpu);
> >  
> >  	/*
> >  	 * By now we've cleared cpu_active_mask, wait for all preempt-disabled
> > @@ -321,6 +320,8 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
> >  #endif
> >  	synchronize_rcu();
> >  
> > +	smpboot_park_threads(cpu);
> > +
> >  	/*
> >  	 * So now all preempt/rcu users must observe !cpu_active().
> >  	 */
>
> Good thinking.. Wu did this cure stuff?

Yes, it fixed the problem.

Tested-by: Fengguang Wu <fengguang.wu@...el.com>
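
For anyone following the thread, the ordering problem Michael describes can
be modeled in a few lines of userspace C. This is only a toy sketch and not
kernel code: struct toy_cpu, toy_synchronize() and the two cpu_down_*()
helpers are hypothetical names made up for illustration, and the bounded
step count merely stands in for the hung task timeout. It assumes, as in
Michael's analysis, that the pending callback can only be completed by the
per-CPU worker thread.

```c
#include <stdbool.h>

/* Toy model of one CPU: a worker thread that completes queued callbacks. */
struct toy_cpu {
	bool worker_parked;	/* true once the worker has been parked */
	int pending_callbacks;	/* callbacks the worker still has to run */
};

/* Stand-in for parking the per-CPU worker: it stops making progress. */
static void toy_park_worker(struct toy_cpu *c)
{
	c->worker_parked = true;
}

/* One scheduling step: the worker runs a callback only if not parked. */
static void toy_worker_step(struct toy_cpu *c)
{
	if (!c->worker_parked && c->pending_callbacks > 0)
		c->pending_callbacks--;
}

/*
 * Stand-in for a synchronous wait: queue one callback and wait for the
 * worker to complete it. Gives up after max_steps so the model terminates;
 * returns true if the callback completed, false if the wait "hung".
 */
static bool toy_synchronize(struct toy_cpu *c, int max_steps)
{
	int i;

	c->pending_callbacks++;
	for (i = 0; i < max_steps && c->pending_callbacks > 0; i++)
		toy_worker_step(c);
	return c->pending_callbacks == 0;
}

/* Old ordering: park the worker first, then wait on it. */
static bool cpu_down_park_first(int max_steps)
{
	struct toy_cpu c = { false, 0 };

	toy_park_worker(&c);
	return toy_synchronize(&c, max_steps);
}

/* Patched ordering: wait first, park the worker afterwards. */
static bool cpu_down_sync_first(int max_steps)
{
	struct toy_cpu c = { false, 0 };
	bool done = toy_synchronize(&c, max_steps);

	toy_park_worker(&c);
	return done;
}
```

In this model cpu_down_park_first() never completes however large
max_steps is, while cpu_down_sync_first() completes on the first step,
which mirrors the effect of moving smpboot_park_threads() below
synchronize_rcu() in the patch.
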
/kernel/i386-randconfig-j3-11101308/484f4e66a6a1102edf02407479f6f7632aade0f3
+--------------------------------------------------+--------------+--------------+
| | e5137b50a064 | 484f4e66a6a1 |
+--------------------------------------------------+--------------+--------------+
| boot_successes | 42 | 100 |
| boot_failures | 58 | |
| INFO:task_blocked_for_more_than_seconds | 58 | |
| Kernel_panic-not_syncing:hung_task:blocked_tasks | 58 | |
+--------------------------------------------------+--------------+--------------+
/kernel/x86_64-randconfig-x4-1108/484f4e66a6a1102edf02407479f6f7632aade0f3
+------------------------------------------------------------------------------------+-----------+--------------+--------------+
| | v3.12-rc7 | e5137b50a064 | 484f4e66a6a1 |
+------------------------------------------------------------------------------------+-----------+--------------+--------------+
| boot_successes | 59 | 34 | 100 |
| has_kernel_error_warning | 4 | | |
| BUG:kernel_early_hang_without_any_printk_output | 4 | | |
| boot_failures | 0 | 66 | |
| INFO:task_blocked_for_more_than_seconds | 0 | 66 | |
| INFO:NMI_handler(arch_trigger_all_cpu_backtrace_handler)took_too_long_to_run:msecs | 0 | 55 | |
| Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | 66 | |
+------------------------------------------------------------------------------------+-----------+--------------+--------------+