Message-ID: <877d6uref8.ffs@tglx>
Date: Mon, 09 May 2022 16:03:39 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
x86@...nel.org
Cc: Tony Luck <tony.luck@...el.com>, Andi Kleen <ak@...ux.intel.com>,
Stephane Eranian <eranian@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Joerg Roedel <joro@...tes.org>,
Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>,
David Woodhouse <dwmw2@...radead.org>,
Lu Baolu <baolu.lu@...ux.intel.com>,
Nicholas Piggin <npiggin@...il.com>,
"Ravi V. Shankar" <ravi.v.shankar@...el.com>,
Ricardo Neri <ricardo.neri@...el.com>,
iommu@...ts.linux-foundation.org, linuxppc-dev@...ts.ozlabs.org,
linux-kernel@...r.kernel.org,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>
Subject: Re: [PATCH v6 22/29] x86/watchdog/hardlockup: Add an HPET-based
hardlockup detector
On Thu, May 05 2022 at 17:00, Ricardo Neri wrote:
> +	if (is_hpet_hld_interrupt(hdata)) {
> +		/*
> +		 * Kick the timer first. If the HPET channel is periodic, it
> +		 * helps to reduce the delta between the expected TSC value and
> +		 * its actual value the next time the HPET channel fires.
> +		 */
> +		kick_timer(hdata, !(hdata->has_periodic));
> +
> +		if (cpumask_weight(hld_data->monitored_cpumask) > 1) {
> +			/*
> +			 * Since we cannot know the source of an NMI, the best
> +			 * we can do is to use a flag to indicate to all online
> +			 * CPUs that they will get an NMI and that the source of
> +			 * that NMI is the hardlockup detector. Offline CPUs
> +			 * also receive the NMI but they ignore it.
> +			 *
> +			 * Even though we are in NMI context, we have concluded
> +			 * that the NMI came from the HPET channel assigned to
> +			 * the detector, an event that is infrequent and only
> +			 * occurs in the handling CPU. There should not be races
> +			 * with other NMIs.
> +			 */
> +			cpumask_copy(hld_data->inspect_cpumask,
> +				     cpu_online_mask);
> +
> +			/* If we are here, IPI shorthands are enabled. */
> +			apic->send_IPI_allbutself(NMI_VECTOR);

So if the monitored cpumask is a subset of online CPUs, which is the
case when isolation features are enabled, then you still send NMIs to
those isolated CPUs. I'm sure the isolation folks will be enthused.
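
To make the concern concrete, here is a minimal userspace sketch of the
target set one would want instead: only CPUs that are both monitored and
online, minus the CPU handling the HPET NMI. It is a model under stated
assumptions, not kernel code and not a proposed patch. The type
cpu_bits_t and the helper hld_nmi_targets() are hypothetical names, and
the 64-CPU bitmask stands in for a struct cpumask. In the kernel the
resulting mask would presumably go to a mask-based IPI callback rather
than the allbutself shorthand.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_bits_t;

/*
 * Hypothetical helper: return the CPUs that should receive the detector
 * NMI. A CPU qualifies only if it is monitored and online, and it is not
 * the CPU that is already handling the HPET NMI.
 */
static cpu_bits_t hld_nmi_targets(cpu_bits_t monitored, cpu_bits_t online,
				  unsigned int self)
{
	assert(self < 64);
	return monitored & online & ~((cpu_bits_t)1 << self);
}
```

With CPUs 0-7 online, CPUs 0-3 monitored and CPU 0 handling the NMI,
the helper yields CPUs 1-3 only, so the isolated CPUs 4-7 are left
alone.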
Thanks,
tglx