linux-kernel - Re: [PATCH platform-next 2/2] platform/mellanox: mlxreg-hotplug: Add support for handling interrupt storm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <57573eee-8b8f-2399-0723-f6b68c9d779e@linux.intel.com>
Date: Mon, 29 Sep 2025 19:40:21 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Thomas Gleixner <tglx@...utronix.de>, Ciju Rajan K <crajank@...dia.com>
cc: Hans de Goede <hdegoede@...hat.com>, LKML <linux-kernel@...r.kernel.org>, 
    Andy Shevchenko <andriy.shevchenko@...ux.intel.com>, 
    christophe.jaillet@...adoo.fr, platform-driver-x86@...r.kernel.org, 
    vadimp@...dia.com
Subject: Re: [PATCH platform-next 2/2] platform/mellanox: mlxreg-hotplug:
 Add support for handling interrupt storm

On Sat, 27 Sep 2025, Thomas Gleixner wrote:

> On Tue, Sep 16 2025 at 13:00, Ilpo Järvinen wrote:
> > On Tue, 16 Sep 2025, Ciju Rajan K wrote:
> >> This patch provides a mechanism to detect device causing interrupt
> >> flooding and mask interrupt for this specific device, to isolate
> >> from interrupt handling flow. Use the following criteria: if the
> >> specific interrupt was generated 'N' times during 'T' seconds,
> >> such device is to be considered as broken and will be closed for
> >> getting interrupts. User will be notified through the log error
> >> and will be instructed to replace broken device.
> >> 
> >> Reviewed-by: Vadim Pasternak <vadimp@...dia.com>
> >> Signed-off-by: Ciju Rajan K <crajank@...dia.com>
> >
> > I've a general question on this approach, probably more directed towards 
> > Hans, Thomas, or others who might have some insight.
> >
> > Are drivers expected to build their own workarounds for interrupt storms 
> > due to broken HW such as this? It sounds something that should be at least 
> > in part handled by something generic, while the lower-most level masking 
> > and detection might still need to be done by the driver to handle the HW 
> > specific aspects, there seems to be a generic aspect in all this.
> >
> > Is there something generic for this already? If not, should there be 
> > instead of adding this in full into an end-of-the-food-chain driver?
> 
> We don't have a generic mechanism for that. The core handles only the
> case of unhandled runaway interrupts.
> 
> I agree, that there should be some generic infrastructure for that.
> 
> Something like the incomplete below, which obviously needs some thoughts
> vs. shared interrupts and other details, but you get the idea.

Thanks Thomas!

Ciju, please base this storm detection on this kind of approach. I also 
started to wonder could some non-storming bit end up being detect as 
storming because your v3 approach doesn't do counters per asserted bits so 
if unlucky, non-storming bits might be asserted on the very moment the 
detection threshold was crossed and gets marked as storming even if it's 
not?

-- 
 i.

> ---
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1927,6 +1927,10 @@ static struct irqaction *__free_irq(stru
>  		irq_release_resources(desc);
>  		chip_bus_sync_unlock(desc);
>  		irq_remove_timings(desc);
> +		if (desc->irq_storm) {
> +			kfree(desc->irq_storm);
> +			desc->irq_storm = NULL;
> +		}
>  	}
>  
>  	mutex_unlock(&desc->request_mutex);
> --- a/kernel/irq/spurious.c
> +++ b/kernel/irq/spurious.c
> @@ -22,6 +22,44 @@ static DEFINE_TIMER(poll_spurious_irq_ti
>  int irq_poll_cpu;
>  static atomic_t irq_poll_active;
>  
> +/* Runaway interrupt detection */
> +struct irq_storm {
> +	unsigned long		max_cnt;
> +	unsigned long		last_cnt;
> +	irq_storm_cb_t		cb;
> +	void			*cb_arg;
> +};
> +
> +bool irq_register_storm_detection(unsigned int irq, unsigned int max_freq,
> +				  irq_storm_cb_t cb, void *cb_arg)
> +{
> +	struct irq_storm *is;
> +	unsigned long cnt;
> +	bool first;
> +
> +	if (max_freq < 500)
> +		return false;
> +
> +	is = kzalloc(sizeof(*is), GFP_KERNEL);
> +	if (!is)
> +		return false;
> +
> +	/* Adjust to cnt/10ms */
> +	is->max_cnt = max_freq / 100;
> +	is->cb_arg = cb->arg;
> +	is->cb = cb;
> +
> +	scoped_irqdesc_get_and_buslock(irq, 0) {
> +		if (scoped_desc->action) {
> +			is->last_cnt = scoped_desc->tot_cnt;
> +			scoped_desc->irq_storm = is;
> +			return true;
> +		}
> +	}
> +	kfree(is);
> +	return false;
> +}
> +
>  /*
>   * Recovery handler for misrouted interrupts.
>   */
> @@ -217,6 +255,19 @@ static inline bool try_misrouted_irq(uns
>  	return action && (action->flags & IRQF_IRQPOLL);
>  }
>  
> +static void irq_storm_check(struct irq_desc *desc)
> +{
> +	unsigned long delta, now = jiffies;
> +
> +	if (!time_after_irq(now, desc->irq_storm.next_period))
> +		return;
> +	desc->irq_storm.next_period = now + msec_to_jiffies(10);
> +	delta = desc->tot_cnt - desc->irq_storm.last_cnt;
> +	desc->irq_storm.last_cnt = desc->tot_cnt;
> +	if (delta > desc->irq_storm.max_cnt)
> +		desc->irq_storm.cb(irq_desc_get_irq(desc), delta * 100, desc->irq_storm.dev_id);
> +}
> +
>  #define SPURIOUS_DEFERRED	0x80000000
>  
>  void note_interrupt(struct irq_desc *desc, irqreturn_t action_ret)
> @@ -231,6 +282,9 @@ void note_interrupt(struct irq_desc *des
>  		return;
>  	}
>  
> +	if (desc->irq_storm && action_ret == IRQ_HANDLED)
> +		irq_storm_check(desc);
> +
>  	/*
>  	 * We cannot call note_interrupt from the threaded handler
>  	 * because we need to look at the compound of all handlers
>