[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1204555505.3842.4.camel@yangyi-dev.bj.intel.com>
Date: Mon, 03 Mar 2008 22:45:04 +0800
From: Yi Yang <yi.y.yang@...el.com>
To: ego@...ibm.com
Cc: Ingo Molnar <mingo@...e.hu>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, Oleg Nesterov <oleg@...sign.ru>,
"Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: [BUG 2.6.25-rc3] scheduler/hotplug: some processes are
dealocked when cpu is set to offline
On Mon, 2008-03-03 at 21:01 +0530, Gautham R Shenoy wrote:
> > This issue seems such one, but i tried to change it to follow this rule but
> > the issue is still there.
> >
> > Why isn't the kernel thread [watchdog/1] reaped by its parent? its state
> > is TASK_RUNNING with high priority (R< means this), why it isn't done?
> >
> > Anyone ever met such a problem? Your thought?
>
> Hi Yi,
>
> This is indeed strange. I am able to reproduce this problem on my 4-way
> box. From what I see in the past two runs, we're waiting in the
> cpu-hotplug callback path for the watchdog/1 thread to stop.
>
> During cpu-offline, once the cpu goes offline, in the migration_call(),
> we migrate any tasks associated with the offline cpus
> to some other cpu. This also mean breaking affinity for tasks which were
> affined to the cpu which went down. So watchdog/1 has been migrated to
> some other cpu.
No, [watchdog/1] is just for CPU #1, if CPU #1 has been offline, it
should be killed but not migrated to other CPU because other CPU has
such a kthread.
Maybe migration_call was doing such a bad thing. :-)
>
> However, it remains in R< state and has not executed the
> kthread_should_stop() instruction.
>
> I'm trying to probe further by inserting a few more printk's in there.
>
> Will post the findings in a couple of hours.
>
> Thanks for reporting the problem.
>
> Regards
> gautham.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists