lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aWilWsTaAZeH8x1x@smile.fi.intel.com>
Date: Thu, 15 Jan 2026 10:29:14 +0200
From: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To: Ciju Rajan K <crajank@...dia.com>
Cc: hdegoede@...hat.com, ilpo.jarvinen@...ux.intel.com, tglx@...utronix.de,
	christophe.jaillet@...adoo.fr, vadimp@...dia.com,
	platform-driver-x86@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH platform-next v4 1/2] kernel/irq: Add generic interrupt
 storm detection mechanism

On Thu, Jan 15, 2026 at 09:49:08AM +0200, Ciju Rajan K wrote:
> If the hardware is broken, it is possible that faulty device will
> flood interrupt handler with false events. For example, if fan or
> power supply has damaged presence pin, it will cause permanent

I would say "has a floating presence pin" as the term floating is
well established and describes the case exactly how you put it further
in the text.

> generation of plugged in / plugged out events. As a result, interrupt
> handler will consume a lot of CPU resources and will keep raising
> "UDEV" events to the user space.
> 
> This patch provides a mechanism for detecting interrupt storm.
> Use the following criteria: if the specific interrupt was generated
> 'N' times during 'T' seconds, such device is to be considered as
> broken and user will be notified through a call back function.
> This feature can be used by any kernel subsystems or drivers.
> 
> The implementation includes:
> 
> - irq_storm_cb_t: Callback function type for storm notifications
> - struct irq_storm: Per-IRQ storm detection data structure
> - irq_register_storm_detection(): Register storm detection with
> 				  configurable parameters
> - irq_unregister_storm_detection(): Unregister storm detection
> - Integration with note_interrupt() for automatic storm checking
> 
> Callback API parameters:
> - irq: interrupt number to monitor
> - max_freq: maximum allowed frequency (interrupts per second)
> - dev_id: device identifier passed to callback

...

> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -20,6 +20,7 @@
>  #include <asm/ptrace.h>
>  #include <asm/irq.h>
>  #include <asm/sections.h>
> +#include <linux/jiffies.h>

I would not mix linux/* group with asm/* group and to me the best location for
this inclusion is just before kref.h. Note, this header needs a bit more of
a cleanup, as it includes basically everything (due to kernel.h and maybe other
messed up headers). But this is out of scope of your series.

...

>  extern int irq_can_set_affinity(unsigned int irq);
>  extern int irq_select_affinity(unsigned int irq);

> +extern bool irq_register_storm_detection(unsigned int irq, unsigned int max_freq,
> +					 irq_storm_cb_t cb, void *dev_id);
> +extern void irq_unregister_storm_detection(unsigned int irq);

Do we still need "extern" keyword?

>  extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
>  				     bool setaffinity);

...

> +struct irq_storm {
> +	unsigned long		max_cnt;
> +	unsigned long		last_cnt;
> +	unsigned long		next_period;
> +	irq_storm_cb_t		cb;
> +	void			*dev_id;
> +};

I'm wondering if you have tried to shuffle this layout based on the frequency
of a use of each member. In some cases it might generate less code (can be
measured with bloat-o-meter).

...

> static struct irqaction *__free_irq(struct irq_desc *desc, void *dev_id)

>  		irq_release_resources(desc);
>  		chip_bus_sync_unlock(desc);
>  		irq_remove_timings(desc);

> +		if (desc->irq_storm) {

Unneeded (duplicate) check.

> +			kfree(desc->irq_storm);

> +			desc->irq_storm = NULL;

Do we need this? If so, still can be done unconditionally.

> +		}
>  	}

> +/* Minimum frequency threshold */
> +#define IRQ_STORM_MIN_FREQ_HZ		50
> +#define IRQ_STORM_MAX_FREQ_SCALE	(IRQ_STORM_MIN_FREQ_HZ * 2)

Plain numbers are easier to read, hence 100

> +/* Time window over which storm check is performed */
> +#define IRQ_STORM_PERIOD_WINDOW_MS	(IRQ_STORM_MIN_FREQ_HZ * 20)

MS = HZ?! It's from some other universe with different physics laws.

...

> + * Returns: true on success, false on failure

Are the rest use "Returns" (with "s")? Because the main keyword is "Return".
(Yes, "Returns" works, but it's a secondary one, not even documented IIRC.)

...

> +bool irq_register_storm_detection(unsigned int irq, unsigned int max_freq,
> +				  irq_storm_cb_t cb, void *dev_id)
> +{
> +	struct irq_storm *storm;
> +	bool ret = false;
> +
> +	if (max_freq < IRQ_STORM_MIN_FREQ_HZ || !cb)
> +		return false;
> +
> +	storm = kzalloc(sizeof(*storm), GFP_KERNEL);
> +	if (!storm)
> +		return false;
> +
> +	/* Adjust to count per 10ms */
> +	storm->max_cnt = max_freq / (IRQ_STORM_MAX_FREQ_SCALE);
> +	storm->cb = cb;
> +	storm->dev_id = dev_id;
> +
> +	scoped_irqdesc_get_and_buslock(irq, IRQ_GET_DESC_CHECK_GLOBAL) {
> +		if (scoped_irqdesc->action && !scoped_irqdesc->irq_storm) {
> +			storm->last_cnt = scoped_irqdesc->tot_count;
> +			storm->next_period = jiffies + msecs_to_jiffies(IRQ_STORM_PERIOD_WINDOW_MS);
> +			scoped_irqdesc->irq_storm = storm;
> +			ret = true;
> +		}
> +	}

> +	if (!ret)
> +		kfree(storm);
> +
> +	return ret;

Better to avoid negative conditionals in such contexts.

	if (ret)
		return ret;

	kfree(...);

But it's boolean and assigned inside scoped section. Why we can't return true
directly from scoped section?

> +}

...

> +void irq_unregister_storm_detection(unsigned int irq)
> +{
> +	scoped_irqdesc_get_and_buslock(irq, IRQ_GET_DESC_CHECK_GLOBAL) {

> +		if (scoped_irqdesc->irq_storm) {

Dup check, drop it.

> +			kfree(scoped_irqdesc->irq_storm);
> +			scoped_irqdesc->irq_storm = NULL;
> +		}
> +	}
> +}

...

> +static void irq_storm_check(struct irq_desc *desc)
> +{
> +	struct irq_storm *storm = desc->irq_storm;
> +	unsigned long delta, now = jiffies;
> +
> +	if (!time_after_eq(now, storm->next_period))
> +		return;

> +	storm->next_period = now + msecs_to_jiffies(IRQ_STORM_PERIOD_WINDOW_MS);

Just do

#define IRQ_STORM_PERIOD_WINDOW		msecs_to_jiffies(1000)

It will address my above comment and make this be read better.

> +	delta = desc->tot_count - storm->last_cnt;
> +	storm->last_cnt = desc->tot_count;
> +	if (delta > storm->max_cnt) {
> +		/* Calculate actual frequency: interrupts per second */
> +		storm->cb(irq_desc_get_irq(desc),
> +			(delta * (IRQ_STORM_MAX_FREQ_SCALE)),
> +			storm->dev_id);
> +	}
> +}


-- 
With Best Regards,
Andy Shevchenko



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ