Message-ID: <37D7C6CF3E00A74B8858931C1DB2F0775371D8AA@SHSMSX103.ccr.corp.intel.com>
Date: Mon, 17 Jul 2017 12:18:45 +0000
From: "Liang, Kan" <kan.liang@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: Don Zickus <dzickus@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mingo@...nel.org" <mingo@...nel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"babu.moger@...cle.com" <babu.moger@...cle.com>,
"atomlin@...hat.com" <atomlin@...hat.com>,
"prarit@...hat.com" <prarit@...hat.com>,
"torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"eranian@...gle.com" <eranian@...gle.com>,
"acme@...hat.com" <acme@...hat.com>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: RE: [PATCH V2] kernel/watchdog: fix spurious hard lockups
>
> On Mon, 17 Jul 2017, Liang, Kan wrote:
> > There are three proposed patches so far.
> > Patch 1: The patch above, which speeds up the hrtimer.
> > Patch 2: Thomas's first proposal.
> > https://patchwork.kernel.org/patch/9803033/
> > https://patchwork.kernel.org/patch/9805903/
> > Patch 3: My original proposal, which increases the NMI watchdog timeout
> > by 3X. https://patchwork.kernel.org/patch/9802053/
> >
> > According to our tests, only patch 3 works well.
> > The other two patches eventually hang the system.
> > With patch 1, the system hung after running our test case for ~1 hour.
> > With patch 2, the system hung during the overnight test.
> > No error message was shown when the system hung, so I don't know
> > the root cause yet.
>
> That doesn't make sense. What's the exact test procedure?
I don't know the exact test procedure. The test case is from our customer.
I only know that the test case makes calls into the X11 libraries.
>
> > BTW: We set watchdog_thresh to 1 when we ran the test.
> > We believe that speeds up the failure.
>
> Believe is not really a technical measure....
>
1 is a valid value for watchdog_thresh.
It was set through the standard proc interface:
/proc/sys/kernel/watchdog_thresh
It should not impact the final test result.
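
For reference, a minimal sketch of how the threshold can be read and set through that proc file. The helper names (read_thresh, write_thresh, derived_timings) are hypothetical, not part of the customer test. The 0..60 range check and the derived timings (soft lockup threshold of 2 * watchdog_thresh, hrtimer period of one fifth of that) reflect my understanding of kernel/watchdog.c and should be treated as assumptions. Writing the real file requires root.

```python
from pathlib import Path

# Standard sysctl file mentioned above; writing to it requires root.
WATCHDOG_THRESH = Path("/proc/sys/kernel/watchdog_thresh")


def read_thresh(path=WATCHDOG_THRESH):
    """Return the current watchdog threshold in seconds."""
    return int(Path(path).read_text().strip())


def write_thresh(seconds, path=WATCHDOG_THRESH):
    """Set the watchdog threshold.

    Assumption: the sysctl accepts 0..60, where 0 disables the detector.
    """
    if not 0 <= seconds <= 60:
        raise ValueError(f"watchdog_thresh out of range: {seconds}")
    Path(path).write_text(f"{seconds}\n")


def derived_timings(thresh):
    """Timings derived from watchdog_thresh (assumed, see note above)."""
    soft = 2 * thresh  # soft lockup threshold in seconds
    return {
        "hard_lockup_s": thresh,       # NMI watchdog period
        "soft_lockup_s": soft,
        "hrtimer_period_s": soft / 5,  # watchdog hrtimer sample period
    }


if __name__ == "__main__":
    print("current watchdog_thresh:", read_thresh())
    print("timings at watchdog_thresh=1:", derived_timings(1))
```

Under those assumptions, watchdog_thresh=1 gives an hrtimer period of 0.4 seconds against a 1 second NMI period, so there is much less slack than with the default of 10.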
Thanks,
Kan