Message-ID: <4fad5d31-0e04-2a1e-68e6-5512f8dd93bf@redhat.com>
Date: Mon, 18 Jun 2018 15:14:50 -0400
From: David Arcari <darcari@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
Andi Kleen <ak@...ux.intel.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Donald Zickus <dzickus@...hat.com>,
Prarit Bhargava <prarit@...hat.com>,
Jerry Hoemann <jerry.hoemann@....com>
Subject: Re: [PATCH] perf/x86: read the FREEZE_WHILE_SMM bit during boot
On 06/12/2018 12:56 PM, Peter Zijlstra wrote:
>
>> Ultimately, my solution was to restore the previous behavior by reading and
>> storing the firmware setting of the bit rather than to always clear it.
>
> Ah, urgh.. what a mess. So the OS setting the bit to a known and
> consistent value is 'good' IMO. The firmware magically frobbing things
> is 'bad'.
I had actually considered changing the code to enable FREEZE_WHILE_SMM by
default, but decided against this approach because I was concerned that setting
the bit on a system where the firmware initially clears it could also have
negative side effects.
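
To make the options concrete, here is a rough user-space sketch of the three
policies under discussion (always clear, always set, keep the firmware value).
It is not the patch itself. FREEZE_WHILE_SMM is bit 14 of IA32_DEBUGCTL; the
enum and the function names are hypothetical and exist only for illustration,
and the MSR value is passed in as a plain integer rather than read from
hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* FREEZE_WHILE_SMM is bit 14 of the IA32_DEBUGCTL MSR. */
#define FREEZE_WHILE_SMM_BIT 14
#define FREEZE_WHILE_SMM     (UINT64_C(1) << FREEZE_WHILE_SMM_BIT)

/* Hypothetical names for the three boot-time policies. */
enum smm_policy {
	POLICY_ALWAYS_CLEAR,   /* OS forces the bit to 0 */
	POLICY_ALWAYS_SET,     /* OS forces the bit to 1 */
	POLICY_KEEP_FIRMWARE,  /* OS leaves the firmware setting untouched */
};

/* Report whether the given DEBUGCTL value has FREEZE_WHILE_SMM set. */
static bool firmware_freeze_on_smi(uint64_t debugctl)
{
	return (debugctl & FREEZE_WHILE_SMM) != 0;
}

/* Return the DEBUGCTL value that the OS would write under a given policy. */
static uint64_t apply_policy(uint64_t debugctl, enum smm_policy policy)
{
	switch (policy) {
	case POLICY_ALWAYS_CLEAR:
		return debugctl & ~FREEZE_WHILE_SMM;
	case POLICY_ALWAYS_SET:
		return debugctl | FREEZE_WHILE_SMM;
	case POLICY_KEEP_FIRMWARE:
		break;
	}
	/* Keep whatever the firmware programmed, including all other bits. */
	return debugctl;
}
```

Only POLICY_KEEP_FIRMWARE leaves a firmware-set bit in place by default; the
other two overwrite it regardless of what the firmware chose.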
>
> Now, explain to me why an IO-check results in an external NMI, and why
> there are long running SMI handlers around? Why can't the IO error not
> be propagated through the regular device interrupt/state? Why are long
> running SMIs required at all, ever? Why doesn't the OS handler whatever
> it is the SMM does?
>
> Are you not solving the wrong problem here?
>
I didn't think so.
1) This functionality was working reasonably well before this commit was
introduced into the OS.
2) As discussed, the problem cannot be addressed in the NMI handler.
3) The workaround that I proposed is quite unobtrusive. You are correct that
this is not an actual "fix", since the problem will still be present if the user
decides to change the setting of FREEZE_WHILE_SMM via sysfs, but at least
external NMIs are functional by default.
IIUC, you are proposing a complete rewrite of the external NMI infrastructure
along with modifications to system firmware. I also believe this would be
somewhat problematic as an external NMI would not function when interrupts are
disabled.
Is there an alternative solution that could provide relief in the short term? I
think that what I have proposed accomplishes this, but perhaps there is a
better, more palatable alternative.