[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1204565855.3842.22.camel@yangyi-dev.bj.intel.com>
Date: Tue, 04 Mar 2008 01:37:35 +0800
From: Yi Yang <yi.y.yang@...el.com>
To: Dmitry Adamushko <dmitry.adamushko@...il.com>
Cc: Ingo Molnar <mingo@...e.hu>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org
Subject: Re: [BUG 2.6.25-rc3] scheduler/hotplug: some processes are
dealocked when cpu is set to offline
On Mon, 2008-03-03 at 22:53 +0800, Yi Yang wrote:
> On Mon, 2008-03-03 at 13:02 +0100, Dmitry Adamushko wrote:
> > On 03/03/2008, Ingo Molnar <mingo@...e.hu> wrote:
> > >
> > > * Dmitry Adamushko <dmitry.adamushko@...il.com> wrote:
> > >
> > > > per_cpu(watchdog_task, hotcpu) = NULL;
> > > > + mlseep(1);
> > >
> > >
> > > that wont build very well ...
> >
> > yeah, I forgot to mention that it's not even compile-tested :-/
> > I re-created it from scratch instead of looking for the original one.
> >
> > please, this one (again, not compile-tested)
> >
> > --- softlockup-prev-2.c 2008-03-03 12:38:36.000000000 +0100
> > +++ softlockup.c 2008-03-03 13:00:20.000000000 +0100
> > @@ -294,6 +294,7 @@ cpu_callback(struct notifier_block *nfb,
> > case CPU_DEAD_FROZEN:
> > p = per_cpu(watchdog_task, hotcpu);
> > per_cpu(watchdog_task, hotcpu) = NULL;
> > + msleep(1);
> > kthread_stop(p);
> > break;
> > #endif /* CONFIG_HOTPLUG_CPU */
>
> I don't think it can fix this issue, it only gives one chance to
> scheduler, i think there are another potential and very serious issues
> inside of scheduler or locking or what else we don't know.
>
> Maybe migration is a doubtful point as Gautham mentioned.
That issue is still there after the above patch is applied.
I found that [watchdog/#] is indeed migrated to other cpu because
migration_call is called before cpu_callback, i think this is the real
root cause very very possibly.
I suggest we can develop a new notifier infrastructure in which one
caller can specify whether it is kthread_stopping a cpu-bind kthread
so that such notifier callbacks can be executed prior to other
callbacks.
> >
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists