Message-Id: <1204565855.3842.22.camel@yangyi-dev.bj.intel.com>
Date:	Tue, 04 Mar 2008 01:37:35 +0800
From:	Yi Yang <yi.y.yang@...el.com>
To:	Dmitry Adamushko <dmitry.adamushko@...il.com>
Cc:	Ingo Molnar <mingo@...e.hu>, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: [BUG 2.6.25-rc3] scheduler/hotplug: some processes are
	dealocked when cpu is set to offline

On Mon, 2008-03-03 at 22:53 +0800, Yi Yang wrote:
> On Mon, 2008-03-03 at 13:02 +0100, Dmitry Adamushko wrote:
> > On 03/03/2008, Ingo Molnar <mingo@...e.hu> wrote:
> > >
> > >  * Dmitry Adamushko <dmitry.adamushko@...il.com> wrote:
> > >
> > >  >                 per_cpu(watchdog_task, hotcpu) = NULL;
> > >  > +               mlseep(1);
> > >
> > >
> > > that wont build very well ...
> > 
> > yeah, I forgot to mention that it's not even compile-tested :-/
> > I re-created it from scratch instead of looking for the original one.
> > 
> > please, this one (again, not compile-tested)
> > 
> > --- softlockup-prev-2.c 2008-03-03 12:38:36.000000000 +0100
> > +++ softlockup.c        2008-03-03 13:00:20.000000000 +0100
> > @@ -294,6 +294,7 @@ cpu_callback(struct notifier_block *nfb,
> >         case CPU_DEAD_FROZEN:
> >                 p = per_cpu(watchdog_task, hotcpu);
> >                 per_cpu(watchdog_task, hotcpu) = NULL;
> > +               msleep(1);
> >                 kthread_stop(p);
> >                 break;
> >  #endif /* CONFIG_HOTPLUG_CPU */
> 
> I don't think it can fix this issue, it only gives one chance to
> scheduler, i think there are another potential and very serious issues
> inside of scheduler or locking or what else we don't know.
> 
> Maybe migration is a doubtful point as Gautham mentioned.

That issue is still there after the above patch is applied.

I found that [watchdog/#] is indeed migrated to another CPU, because
migration_call is called before cpu_callback. I think this is very
likely the real root cause.

I suggest we develop a new notifier infrastructure in which a caller
can specify whether it is calling kthread_stop() on a kthread bound to
a CPU, so that such notifier callbacks are executed before the other
callbacks.
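
To illustrate the ordering I have in mind, here is a rough user-space
sketch, not kernel code and not tested against the real notifier API.
All names in it (struct notifier, notifier_register, notify_cpu_dead,
the PRIO_* values) are made up for illustration. It only shows that a
chain sorted by priority would run a callback that stops a bound
kthread before the migration callback, whatever the registration order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a priority-ordered notifier chain. */
enum cb_id { CB_MIGRATION = 1, CB_WATCHDOG = 2 };

#define MAX_CALLS 8

struct notifier {
	int (*call)(struct notifier *nb, int cpu);
	int priority;			/* higher priority runs first */
	struct notifier *next;
};

/* Illustrative values only; they are assumptions, not kernel constants. */
#define PRIO_MIGRATION			10
#define PRIO_STOPS_BOUND_KTHREAD	20

static struct notifier *chain;
static int call_log[MAX_CALLS];		/* order in which callbacks ran */
static int n_calls;

static void log_call(int id)
{
	assert(n_calls < MAX_CALLS);
	call_log[n_calls++] = id;
}

/* Insert nb so that the chain stays sorted by descending priority. */
static void notifier_register(struct notifier *nb)
{
	struct notifier **p = &chain;

	assert(nb != NULL && nb->call != NULL);
	while (*p != NULL && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Walk the chain in priority order, as a CPU_DEAD event would. */
static void notify_cpu_dead(int cpu)
{
	struct notifier *nb;

	for (nb = chain; nb != NULL; nb = nb->next)
		nb->call(nb, cpu);
}

/* Stand-in for migration_call: would move tasks off the dead CPU. */
static int migration_cb(struct notifier *nb, int cpu)
{
	(void)nb; (void)cpu;
	log_call(CB_MIGRATION);
	return 0;
}

/* Stand-in for softlockup's cpu_callback: would kthread_stop() the watchdog. */
static int watchdog_cb(struct notifier *nb, int cpu)
{
	(void)nb; (void)cpu;
	log_call(CB_WATCHDOG);
	return 0;
}

static struct notifier migration_nb = { migration_cb, PRIO_MIGRATION, NULL };
static struct notifier watchdog_nb = { watchdog_cb, PRIO_STOPS_BOUND_KTHREAD, NULL };

/* Register migration first on purpose; priority must still win. */
static void demo(void)
{
	chain = NULL;
	n_calls = 0;
	notifier_register(&migration_nb);
	notifier_register(&watchdog_nb);
	notify_cpu_dead(1);
}
```

After demo() runs, call_log holds the watchdog entry first and the
migration entry second. struct notifier_block in the kernel already
has a priority field, so something along these lines might be possible
without a whole new infrastructure, but I have not checked that.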

