[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aWimsaqu5gCV5uZV@smile.fi.intel.com>
Date: Thu, 15 Jan 2026 10:34:57 +0200
From: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To: Ciju Rajan K <crajank@...dia.com>
Cc: hdegoede@...hat.com, ilpo.jarvinen@...ux.intel.com, tglx@...utronix.de,
christophe.jaillet@...adoo.fr, vadimp@...dia.com,
platform-driver-x86@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH platform-next v4 2/2] platform/mellanox: mlxreg-hotplug:
Enabling interrupt storm detection
On Thu, Jan 15, 2026 at 09:49:09AM +0200, Ciju Rajan K wrote:
> This patch enables the interrupt storm detection feature and
> also adds the per device counter for tracking the faulty
> devices. It also masks the faulty devices from generating
> any further interrupts.
>
> Add field for interrupt storm handling.
> Extend structure mlxreg_core_data with the following field:
> 'wmark_cntr' - interrupt storm counter.
>
> Extend structure mlxreg_core_item with the following field:
> 'storming_bits' - interrupt storming bits mask.
...
> +static void mlxreg_hotplug_storm_handler(unsigned int irq, unsigned int freq, void *dev_id)
> +{
> + struct mlxreg_hotplug_priv_data *priv = dev_id;
> + struct mlxreg_core_hotplug_platform_data *pdata;
> + struct mlxreg_core_item *item;
> + struct mlxreg_core_data *data;
> + unsigned long asserted;
> + u32 bit;
> +
> + dev_warn(priv->dev,
> + "Interrupt storm detected on IRQ %u (%u interrupts/sec)",
> + irq, freq);
Below you put long line, here it seems wrapped by 80, why so inconsistent?
Please, choose one style and use it everywhere (inside the same file).
> + pdata = dev_get_platdata(&priv->pdev->dev);
> + item = pdata->items;
> + asserted = item->cache;
> +
> + for_each_set_bit(bit, &asserted, 8) {
> + int pos;
> +
> + pos = mlxreg_hotplug_item_label_index_get(item->mask, bit);
> + if (pos < 0)
> + goto out;
Used only once. Just drop the label and move the related code under the branch.
> + data = item->data + pos;
> + /* Check per device interrupt counter */
> + if (data->wmark_cntr >= MLXREG_HOTPLUG_INTR_FREQ_HZ - 1) {
> + dev_err(priv->dev,
> + "Storming bit %d (label: %s) - interrupt masked permanently. Replace broken HW.",
> + bit, data->label);
> + /* Mark bit as storming. */
> + item->storming_bits |= BIT(bit);
> + }
> + data->wmark_cntr = 0;
> + }
> + return;
> + out:
> + dev_err(priv->dev, "Failed to complete interrupt storm handler\n");
> +}
...
> + /* Register with generic interrupt storm detection */
> + if (!irq_register_storm_detection(priv->irq, MLXREG_HOTPLUG_INTR_FREQ_HZ,
> + mlxreg_hotplug_storm_handler, priv)) {
> + dev_warn(&pdev->dev, "Failed to register generic interrupt storm detection\n");
> + } else {
> + dev_info(&pdev->dev, "Registered generic storm detection for IRQ %d\n", priv->irq);
> + }
Invert the conditional, it will be slightly easier to parse.
...
> struct mlxreg_core_data {
> char label[MLXREG_CORE_LABEL_MAX_SIZE];
> u8 regnum;
> u8 slot;
> u8 secured;
> + unsigned int wmark_cntr;
> };
Have you run `pahole`? No issues / room to improve this layout?
...
> struct mlxreg_core_item {
> struct mlxreg_core_data *data;
> u8 ind;
> u8 inversed;
> u8 health;
> + u32 storming_bits;
> };
Ditto.
--
With Best Regards,
Andy Shevchenko
Powered by blists - more mailing lists