Message-ID: <27240C0AC20F114CBF8149A2696CBE4A615FC2FF@SHSMSX101.ccr.corp.intel.com>
Date: Mon, 6 Jan 2020 09:22:06 +0000
From: "Liu, Chuansheng" <chuansheng.liu@...el.com>
To: Borislav Petkov <bp@...en8.de>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Luck, Tony" <tony.luck@...el.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"hpa@...or.com" <hpa@...or.com>
Subject: RE: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized
therm_work
> -----Original Message-----
> From: Borislav Petkov <bp@...en8.de>
> Sent: Monday, January 6, 2020 3:08 PM
> To: Liu, Chuansheng <chuansheng.liu@...el.com>
> Cc: linux-kernel@...r.kernel.org; Luck, Tony <tony.luck@...el.com>;
> tglx@...utronix.de; mingo@...hat.com; hpa@...or.com
> Subject: Re: [PATCH] x86/mce/therm_throt: Fix the access of uninitialized
> therm_work
>
> On Mon, Jan 06, 2020 at 06:41:55AM +0000, Chuansheng Liu wrote:
> > On ICL platforms, it is easy to hit a boot failure with a panic
> > in the thermal interrupt handler during the early boot stage.
> >
> > This issue makes my platform almost unable to boot with the
> > latest kernel code.
> >
> > The call stack is like:
> > kernel BUG at kernel/timer/timer.c:1152!
> >
> > Call Trace:
> > __queue_delayed_work
> > queue_delayed_work_on
> > therm_throt_process
> > intel_thermal_interrupt
> > ...
> >
> > When one CPU is up, the irq is enabled prior to CPU UP
> > notification which will then initialize therm_worker.
>
> You mean the unmasking of the thermal vector at the end of
> intel_init_thermal()?
Exactly. There is also a local CPU IRQ enable later on.
>
> If so, why don't you move that to the end of the notifier and unmask it
> only after all the necessary work like setting up the workqueues etc, is
> done, and save yourself adding yet another silly bool?
>
Thanks for your suggestion. I am just worried about the interrupt delay:
I traced a gap of about 2s between unmasking the interrupt and the workqueue
initialization. If you think it is OK to ignore this delay, I will make another
simple patch as you suggested 😊
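
To make the two orderings concrete, here is a minimal userspace sketch in C.
It is not kernel code and not the proposed patch: every name in it
(model_cpu, model_work, model_queue_work and so on) is hypothetical and only
models the idea that queueing uninitialized work fails, and that a masked
vector delivers nothing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the kernel's structures. */
struct model_work {
    void (*fn)(struct model_work *); /* NULL until the work is initialized */
    int queued;                      /* how many times it was queued */
};

struct model_cpu {
    struct model_work therm_work;
    bool vector_unmasked;            /* models the thermal vector mask */
};

static void model_throttle_fn(struct model_work *w) { (void)w; }

/* Returns false where the real kernel would trip over uninitialized work. */
static bool model_queue_work(struct model_work *w)
{
    if (w->fn == NULL)
        return false;
    w->queued++;
    return true;
}

/* A masked interrupt is not delivered; an unmasked one queues the work. */
static bool model_thermal_interrupt(struct model_cpu *c)
{
    if (!c->vector_unmasked)
        return true;
    return model_queue_work(&c->therm_work);
}

static void model_init_work(struct model_cpu *c)
{
    c->therm_work.fn = model_throttle_fn;
}

static void model_unmask(struct model_cpu *c)
{
    c->vector_unmasked = true;
}

/* Ordering from the bug report: unmask first, initialize the work later. */
static bool model_bringup_unmask_first(void)
{
    struct model_cpu c = {0};
    model_unmask(&c);
    bool ok = model_thermal_interrupt(&c); /* fires inside the gap */
    model_init_work(&c);
    return ok;
}

/* Suggested ordering: initialize the work, then unmask as the last step. */
static bool model_bringup_init_first(void)
{
    struct model_cpu c = {0};
    model_init_work(&c);
    bool ok_masked = model_thermal_interrupt(&c); /* masked, so not delivered */
    model_unmask(&c);
    return ok_masked && model_thermal_interrupt(&c);
}
```

In this model, model_bringup_unmask_first() returns false and
model_bringup_init_first() returns true. The masked interrupt in the second
ordering is simply not delivered until the unmask, which corresponds to the
delay I mentioned above.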
Best Regards
Chuansheng