Message-ID: <9cf3eef6-79d9-4969-be94-e5089a0d625b@suse.com>
Date: Mon, 1 Jul 2024 17:31:48 +0300
From: Nikolay Borisov <nik.borisov@...e.com>
To: Jacob Pan <jacob.jun.pan@...ux.intel.com>, X86 Kernel <x86@...nel.org>,
Sean Christopherson <seanjc@...gle.com>, LKML
<linux-kernel@...r.kernel.org>, Thomas Gleixner <tglx@...utronix.de>,
Dave Hansen <dave.hansen@...el.com>, "H. Peter Anvin" <hpa@...or.com>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Xin Li <xin3.li@...el.com>, linux-perf-users@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Tony Luck <tony.luck@...el.com>,
Andy Lutomirski <luto@...nel.org>, acme@...nel.org,
kan.liang@...ux.intel.com, Andi Kleen <andi.kleen@...el.com>,
"Mehta, Sohil" <sohil.mehta@...el.com>
Subject: Re: [PATCH v3 05/11] x86/irq: Process nmi sources in NMI handler
On 28.06.24 at 23:18, Jacob Pan wrote:
> With NMI source reporting enabled, NMI handler can prioritize the
> handling of sources reported explicitly. If the source is unknown, then
> resume the existing processing flow. i.e. invoke all NMI handlers.
>
> Signed-off-by: Jacob Pan <jacob.jun.pan@...ux.intel.com>
>
> ---
> v3:
> - Use a static flag to disable NMIs in case of HW failure
> - Optimize the case when unknown NMIs are mixed with known NMIs(HPA)
> v2:
> - Disable NMI source reporting once garbage data is given in FRED
> return stack. (HPA)
> ---
> arch/x86/kernel/nmi.c | 73 +++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 70 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index 639a34e78bc9..c3a10af7f26b 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -149,23 +149,90 @@ static inline int do_handle_nmi(struct nmiaction *a, struct pt_regs *regs, unsig
> return thishandled;
> }
>
> +static int nmi_handle_src(unsigned int type, struct pt_regs *regs, unsigned long *handled_mask)
> +{
> + static bool nmi_source_disabled;
> + bool has_unknown_src = false;
> + unsigned long source_bitmask;
> + struct nmiaction *a;
> + int handled = 0;
> + int vec = 1;
> +
> + if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) ||
> + type != NMI_LOCAL || nmi_source_disabled)
> + return 0;
> +
> + source_bitmask = fred_event_data(regs);
> + if (!source_bitmask) {
> + pr_warn("NMI received without source information! Disable source reporting.\n");
> + nmi_source_disabled = true;
> + return 0;
> + }
> +
> + /*
> + * Per NMI source specification, there is no guarantee that a valid
> + * NMI vector is always delivered, even when the source specified
> + * one. It is software's responsibility to check all available NMI
> + * sources when bit 0 is set in the NMI source bitmap. i.e. we have
> + * to call every handler as if we have no NMI source.
> + * On the other hand, if we do get non-zero vectors, we know exactly
> + * what the sources are. So we only call the handlers with the bit set.
> + */
> + if (source_bitmask & BIT(NMI_SOURCE_VEC_UNKNOWN)) {
> + pr_warn_ratelimited("NMI received with unknown source\n");
> + has_unknown_src = true;
> + }
> +
> + rcu_read_lock();
> + /* Bit 0 is for unknown NMI sources, skip it. */
> + for_each_set_bit_from(vec, &source_bitmask, NR_NMI_SOURCE_VECTORS) {
> + a = rcu_dereference(nmiaction_src_table[vec]);
> + if (!a) {
> + pr_warn_ratelimited("NMI received %d no handler", vec);
> + continue;
> + }
> + handled += do_handle_nmi(a, regs, type);
> + /*
> + * Needs polling if unknown source bit is set, handled_mask is
> + * used to tell the polling code which NMIs can be skipped.
> + */
> + if (has_unknown_src)
> + *handled_mask |= BIT(vec);
> + }
> + rcu_read_unlock();
> +
> + return handled;
> +}
> +
> static int nmi_handle(unsigned int type, struct pt_regs *regs)
> {
> struct nmi_desc *desc = nmi_to_desc(type);
> + unsigned long handled_mask = 0;
> struct nmiaction *a;
> int handled=0;
>
> - rcu_read_lock();
> + /*
> + * Check if the NMI source handling is complete, otherwise polling is
> + * still required. handled_mask is non-zero if NMI source handling is
> + * partial due to unknown NMI sources.
> + */
> + handled = nmi_handle_src(type, regs, &handled_mask);
> + if (handled && !handled_mask)
> + return handled;
How about renaming handled_mask to "partial_handled_mask"? In addition
to being a mask, it is also used as a boolean to signal that an unknown
NMI source was encountered.
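
To make the dual use concrete, here is a minimal userspace sketch, not
kernel code. The model_* functions, has_handler[] and NR_SRC_VECTORS are
made up for illustration; only the control flow mirrors the hunk above.
The mask is written only when bit 0 (unknown source) is set, so the
caller's "!mask" test effectively asks "was the source info complete?":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SRC_VECTORS  16      /* hypothetical table size for this model */
#define SRC_VEC_UNKNOWN 0       /* bit 0 means "source unknown", as in the patch */
#define BIT(n)          (1UL << (n))

/* Hypothetical stand-in for nmiaction_src_table[]: true if a handler exists. */
static bool has_handler[NR_SRC_VECTORS];

/*
 * Model of nmi_handle_src(): returns how many reported vectors were handled.
 * *partial_handled_mask is only written when bit 0 is set, so a non-zero
 * value tells the caller both "handling was partial" and "skip these vectors".
 */
static int model_handle_src(unsigned long source_bitmask,
			    unsigned long *partial_handled_mask)
{
	bool has_unknown_src = (source_bitmask & BIT(SRC_VEC_UNKNOWN)) != 0;
	int handled = 0;

	assert(partial_handled_mask != NULL);

	/* Bit 0 is the unknown-source bit, so start at vector 1. */
	for (int vec = 1; vec < NR_SRC_VECTORS; vec++) {
		if (!(source_bitmask & BIT(vec)) || !has_handler[vec])
			continue;
		handled++;
		if (has_unknown_src)
			*partial_handled_mask |= BIT(vec);
	}
	return handled;
}

/* Model of the early-return test in nmi_handle(): is the list walk needed? */
static bool model_needs_polling(int handled, unsigned long partial_handled_mask)
{
	return !(handled && !partial_handled_mask);
}
```

With a handler registered for vector 2, a bitmask of BIT(2) leaves the
mask at zero and skips polling, while BIT(0) | BIT(2) sets the mask to
BIT(2) and forces the list walk. That second role is what the longer
name would spell out.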
>
> + rcu_read_lock();
> /*
> * NMIs are edge-triggered, which means if you have enough
> * of them concurrently, you can lose some because only one
> * can be latched at any given time. Walk the whole list
> * to handle those situations.
> */
> - list_for_each_entry_rcu(a, &desc->head, list)
> + list_for_each_entry_rcu(a, &desc->head, list) {
> + /* Skip NMIs handled earlier with source info */
> + if (BIT(a->source_vec) & handled_mask)
> + continue;
> handled += do_handle_nmi(a, regs, type);
> -
> + }
> rcu_read_unlock();
>
> /* return total number of NMI events handled */