Message-ID: <20240701083604.5dfeb087@jacob-builder>
Date: Mon, 1 Jul 2024 08:36:04 -0700
From: Jacob Pan <jacob.jun.pan@...ux.intel.com>
To: Nikolay Borisov <nik.borisov@...e.com>
Cc: X86 Kernel <x86@...nel.org>, Sean Christopherson <seanjc@...gle.com>,
LKML <linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>,
Dave Hansen <dave.hansen@...el.com>, "H. Peter Anvin" <hpa@...or.com>, Ingo
Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Xin Li
<xin3.li@...el.com>, linux-perf-users@...r.kernel.org, Peter Zijlstra
<peterz@...radead.org>, Paolo Bonzini <pbonzini@...hat.com>, Tony Luck
<tony.luck@...el.com>, Andy Lutomirski <luto@...nel.org>, acme@...nel.org,
kan.liang@...ux.intel.com, Andi Kleen <andi.kleen@...el.com>, "Mehta,
Sohil" <sohil.mehta@...el.com>, jacob.jun.pan@...ux.intel.com
Subject: Re: [PATCH v3 05/11] x86/irq: Process nmi sources in NMI handler
On Mon, 1 Jul 2024 17:31:48 +0300, Nikolay Borisov <nik.borisov@...e.com>
wrote:
> On 28.06.24 at 23:18, Jacob Pan wrote:
> > With NMI source reporting enabled, NMI handler can prioritize the
> > handling of sources reported explicitly. If the source is unknown, then
> > resume the existing processing flow. i.e. invoke all NMI handlers.
> >
> > Signed-off-by: Jacob Pan <jacob.jun.pan@...ux.intel.com>
> >
> > ---
> > v3:
> > - Use a static flag to disable NMIs in case of HW failure
> > - Optimize the case when unknown NMIs are mixed with known NMIs(HPA)
> > v2:
> > - Disable NMI source reporting once garbage data is given in FRED
> > return stack. (HPA)
> > ---
> > arch/x86/kernel/nmi.c | 73 +++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 70 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> > index 639a34e78bc9..c3a10af7f26b 100644
> > --- a/arch/x86/kernel/nmi.c
> > +++ b/arch/x86/kernel/nmi.c
> > @@ -149,23 +149,90 @@ static inline int do_handle_nmi(struct nmiaction *a, struct pt_regs *regs, unsig
> > return thishandled;
> > }
> >
> > +static int nmi_handle_src(unsigned int type, struct pt_regs *regs, unsigned long *handled_mask)
> > +{
> > + static bool nmi_source_disabled;
> > + bool has_unknown_src = false;
> > + unsigned long source_bitmask;
> > + struct nmiaction *a;
> > + int handled = 0;
> > + int vec = 1;
> > +
> > + if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) ||
> > + type != NMI_LOCAL || nmi_source_disabled)
> > + return 0;
> > +
> > + source_bitmask = fred_event_data(regs);
> > + if (!source_bitmask) {
> > + pr_warn("NMI received without source information! Disable source reporting.\n");
> > + nmi_source_disabled = true;
> > + return 0;
> > + }
> > +
> > + /*
> > + * Per NMI source specification, there is no guarantee that a valid
> > + * NMI vector is always delivered, even when the source specified
> > + * one. It is software's responsibility to check all available NMI
> > + * sources when bit 0 is set in the NMI source bitmap. i.e. we have
> > + * to call every handler as if we have no NMI source.
> > + * On the other hand, if we do get non-zero vectors, we know exactly
> > + * what the sources are. So we only call the handlers with the bit set.
> > + */
> > + if (source_bitmask & BIT(NMI_SOURCE_VEC_UNKNOWN)) {
> > + pr_warn_ratelimited("NMI received with unknown source\n");
> > + has_unknown_src = true;
> > + }
> > +
> > + rcu_read_lock();
> > + /* Bit 0 is for unknown NMI sources, skip it. */
> > + for_each_set_bit_from(vec, &source_bitmask, NR_NMI_SOURCE_VECTORS) {
> > + a = rcu_dereference(nmiaction_src_table[vec]);
> > + if (!a) {
> > + pr_warn_ratelimited("NMI received %d no handler", vec);
> > + continue;
> > + }
> > + handled += do_handle_nmi(a, regs, type);
> > + /*
> > + * Needs polling if unknown source bit is set, handled_mask is
> > + * used to tell the polling code which NMIs can be skipped.
> > + */
> > + if (has_unknown_src)
> > + *handled_mask |= BIT(vec);
> > + }
> > + rcu_read_unlock();
> > +
> > + return handled;
> > +}
> > +
> > static int nmi_handle(unsigned int type, struct pt_regs *regs)
> > {
> > struct nmi_desc *desc = nmi_to_desc(type);
> > + unsigned long handled_mask = 0;
> > struct nmiaction *a;
> > int handled=0;
> >
> > - rcu_read_lock();
> > + /*
> > + * Check if the NMI source handling is complete, otherwise polling is
> > + * still required. handled_mask is non-zero if NMI source handling is
> > + * partial due to unknown NMI sources.
> > + */
> > + handled = nmi_handle_src(type, regs, &handled_mask);
> > + if (handled && !handled_mask)
> > + return handled;
>
> How about renaming handled_mask to "partial_handled_mask"? In addition
> to being a mask, it is also used as a boolean to signal that an unknown
> NMI source was encountered.
Yeah, that is better. Will do.
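
To make the dual role of the renamed variable concrete, here is a rough
user-space sketch of the flow. It is not the kernel code and not the next
version of the patch: struct handler, handle_by_source(), handle_nmi(),
the demo handlers, and the table size of 16 are all made up for
illustration. Only the name partial_handled_mask and the bit 0 "unknown
source" convention come from this thread.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_SRC_VECTORS  16   /* assumed table size, for this sketch only */
#define SRC_VEC_UNKNOWN 0    /* bit 0: at least one source was not identified */
#define BIT(n)          (1UL << (n))

/* Hypothetical stand-in for struct nmiaction. */
struct handler {
	unsigned int source_vec;  /* 0 means "registered without a source" */
	int (*fn)(void);          /* returns the number of events handled */
};

/*
 * Source-directed pass: call only the handlers whose bit is set.
 * Bits are recorded in *partial_handled_mask only when bit 0 is also set,
 * so a non-zero mask doubles as "polling is still required".
 */
static int handle_by_source(unsigned long source_bitmask,
			    const struct handler *tbl, size_t n,
			    unsigned long *partial_handled_mask)
{
	bool has_unknown_src = (source_bitmask & BIT(SRC_VEC_UNKNOWN)) != 0;
	int handled = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int vec = tbl[i].source_vec;

		/* Bit 0 is for unknown sources, skip it. */
		if (vec == SRC_VEC_UNKNOWN || vec >= NR_SRC_VECTORS)
			continue;
		if (!(source_bitmask & BIT(vec)))
			continue;
		handled += tbl[i].fn();
		if (has_unknown_src)
			*partial_handled_mask |= BIT(vec);
	}
	return handled;
}

static int handle_nmi(unsigned long source_bitmask,
		      const struct handler *tbl, size_t n)
{
	unsigned long partial_handled_mask = 0;
	int handled;

	handled = handle_by_source(source_bitmask, tbl, n,
				   &partial_handled_mask);
	/* All sources were known and handled: no polling needed. */
	if (handled && !partial_handled_mask)
		return handled;

	/* Poll every handler, skipping those already run above. */
	for (size_t i = 0; i < n; i++) {
		unsigned int vec = tbl[i].source_vec;

		if (vec < NR_SRC_VECTORS && (BIT(vec) & partial_handled_mask))
			continue;
		handled += tbl[i].fn();
	}
	return handled;
}

/* Demo handlers with call counters, purely for illustration. */
static int calls[3];
static int h_a(void)      { calls[0]++; return 1; }
static int h_b(void)      { calls[1]++; return 1; }
static int h_legacy(void) { calls[2]++; return 0; }

static const struct handler demo_tbl[] = {
	{ 1, h_a }, { 2, h_b }, { 0, h_legacy },
};
```

With demo_tbl, handle_nmi(BIT(1), demo_tbl, 3) runs only h_a and returns
early. handle_nmi(BIT(0) | BIT(1), demo_tbl, 3) runs h_a in the first
pass, then polls h_b and h_legacy while skipping h_a.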
> >
> > + rcu_read_lock();
> > /*
> > * NMIs are edge-triggered, which means if you have enough
> > * of them concurrently, you can lose some because only one
> > * can be latched at any given time. Walk the whole list
> > * to handle those situations.
> > */
> > - list_for_each_entry_rcu(a, &desc->head, list)
> > + list_for_each_entry_rcu(a, &desc->head, list) {
> > + /* Skip NMIs handled earlier with source info */
> > + if (BIT(a->source_vec) & handled_mask)
> > + continue;
> > handled += do_handle_nmi(a, regs, type);
> > -
> > + }
> > rcu_read_unlock();
> >
> > /* return total number of NMI events handled */
Thanks,
Jacob