Message-ID: <CAJZ5v0gohOwnGqMk86Zyqxn11fxukXifSe=T08n7vrvv5Q4QNw@mail.gmail.com>
Date: Mon, 13 Jan 2020 11:51:54 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Borislav Petkov <bp@...en8.de>
Cc: Bhaskar Upadhaya <bupadhaya@...vell.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
"open list:EDAC-CORE" <linux-edac@...r.kernel.org>,
Len Brown <lenb@...nel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>, gkulkarni@...vell.com,
Robert Richter <rrichter@...vell.com>,
bhaskar.upadhaya.linux@...il.com
Subject: Re: [PATCH V2] apei/ghes: fix ghes_poll_func by registering in
non-deferrable mode
On Thu, Jan 9, 2020 at 10:50 AM Borislav Petkov <bp@...en8.de> wrote:
>
> On Wed, Jan 08, 2020 at 09:17:38AM -0800, Bhaskar Upadhaya wrote:
> > Currently Linux registers ghes_poll_func with the TIMER_DEFERRABLE flag,
> > because of which it is serviced when the CPU eventually wakes up with a
> > subsequent non-deferrable timer and not at the configured polling interval.
> >
> > For polling mode, the polling interval configured by firmware should not
> > be exceeded as per the ACPI 6.3 spec (refer to Table 18-394), so the timer
> > needs to be configured in non-deferrable mode by removing the
> > TIMER_DEFERRABLE flag. With NO_HZ enabled and the timer callback configured
> > in non-deferrable mode, the callback is called when the polling interval
> > expires instead of waiting for the next unrelated CPU wakeup.
> >
> > Definition of poll interval as per spec (referred ACPI 6.3):
> > "Indicates the poll interval in milliseconds OSPM should use to
> > periodically check the error source for the presence of an error
> > condition"
> >
> > We are observing an issue on our ThunderX2 platforms wherein
> > ghes_poll_func is not called within the poll interval when the timer is
> > configured with the TIMER_DEFERRABLE flag (on a NO_HZ kernel), and hence
> > we are losing error records.
> >
> > Impact of removing the TIMER_DEFERRABLE flag:
> > - With NO_HZ enabled, additional timer ticks and otherwise unnecessary
> > wakeups of the CPU happen at every polling interval.
> >
> > - If the polling interval is too small, then the polling function will be
> > called too frequently, which may stall the CPU.
>
> If that becomes a problem, the polling interval setting should be fixed
> to filter too small values.
>
> Anyway, I went and streamlined your commit message:
>
> apei/ghes: Do not delay GHES polling
>
> Currently, the ghes_poll_func() timer callback is registered with the
> TIMER_DEFERRABLE flag. Thus, it is run when the CPU eventually wakes
> up together with a subsequent non-deferrable timer and not at the precisely
> configured polling interval.
>
> For polling mode, the polling interval configured by firmware should not
> be exceeded according to the ACPI spec 6.3, Table 18-394. The definition
> of the polling interval is:
>
> "Indicates the poll interval in milliseconds OSPM should use to
> periodically check the error source for the presence of an error
> condition."
>
> If this interval is extended due to the timer callback deferring, error
> records can get lost, which we are observing on our ThunderX2 platforms.
>
> Therefore, remove the TIMER_DEFERRABLE flag so that the timer callback
> executes at the precise interval.
>
> and made it more readable, hopefully.
>
> Rafael, pls fixup when applying.
Done.
> With that:
>
> Acked-by: Borislav Petkov <bp@...e.de>
Thanks!
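
For anyone reading this thread in the archive, the difference between the
two modes can be sketched with a small userspace model. This is not kernel
code: struct sim_timer, sim_fire_time() and sim_poll_overrun() are
hypothetical names made up for illustration, and the model assumes a fully
idle NO_HZ CPU whose only other wakeup source is given by next_wakeup_ms.
In the kernel the choice is presumably made through the flags argument of
timer_setup(), which is an assumption here since the diff is not quoted in
this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a timer; all times are in milliseconds. */
struct sim_timer {
	uint64_t expires_ms;	/* configured expiry (the poll interval) */
	bool deferrable;	/* models the TIMER_DEFERRABLE flag */
};

/*
 * Time at which the callback runs on an idle NO_HZ CPU.
 *
 * next_wakeup_ms is the next time the CPU wakes up for an unrelated
 * reason (for example a non-deferrable timer). The model assumes it is
 * not earlier than the expiry; if it is, the result is clamped to the
 * expiry.
 */
uint64_t sim_fire_time(const struct sim_timer *t, uint64_t next_wakeup_ms)
{
	assert(t != NULL);

	/* A non-deferrable timer wakes the CPU itself, so it fires on time. */
	if (!t->deferrable)
		return t->expires_ms;

	/* A deferrable timer waits for the next unrelated wakeup. */
	return next_wakeup_ms > t->expires_ms ? next_wakeup_ms : t->expires_ms;
}

/* How far the callback overruns the configured poll interval. */
uint64_t sim_poll_overrun(const struct sim_timer *t, uint64_t next_wakeup_ms)
{
	return sim_fire_time(t, next_wakeup_ms) - t->expires_ms;
}
```

With a 1000 ms poll interval and the next unrelated wakeup at 5000 ms, the
deferrable variant overruns the interval by 4000 ms in this model, while the
non-deferrable one does not overrun at all. That overrun is the window in
which error records can be lost, per the commit message above.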