lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251112192408.3646835-4-lrizzo@google.com>
Date: Wed, 12 Nov 2025 19:24:05 +0000
From: Luigi Rizzo <lrizzo@...gle.com>
To: Thomas Gleixner <tglx@...utronix.de>, Marc Zyngier <maz@...nel.org>, 
	Luigi Rizzo <rizzo.unipi@...il.com>, Paolo Abeni <pabeni@...hat.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, Sean Christopherson <seanjc@...gle.com>, 
	Jacob Pan <jacob.jun.pan@...ux.intel.com>
Cc: linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org, 
	Bjorn Helgaas <bhelgaas@...gle.com>, Willem de Bruijn <willemb@...gle.com>, 
	Luigi Rizzo <lrizzo@...gle.com>
Subject: [PATCH 3/6] genirq: soft_moderation: activate hooks in handle_irq_event()

Activate soft_moderation via the hooks in handle_irq_event()
and per-CPU and irq_desc initialization.

This change only implements fixed moderation. It needs to be
explicitly enabled at runtime on individual interrupts.

Example (kernel built with CONFIG_SOFT_IRQ_MODERATION=y)

  # enable fixed moderation
  echo "delay_us=400" > /proc/irq/soft_moderation

  # enable on network interrupts (change name as appropriate)
  echo on | tee /proc/irq/*/*eth*/../soft_moderation

  # show it works by looking at counters in /proc/irq/soft_moderation
  cat /proc/irq/soft_moderation

  # Show runtime impact on ping times changing delay_us
  ping -n -f -q -c 1000 ${some_nearby_host}
  echo "delay_us=100" > /proc/irq/soft_moderation
  ping -n -f -q -c 1000 ${some_nearby_host}

Configuration via module parameters (irq_moderation.${name}=${value}) or
echo "${name}=${value}" > /proc/irq/soft_moderation)

delay_us   0=off, range 1-500, default 100
  how long an interrupt is disabled after it fires. Small values are
  accumulated until they are large enough, e.g. 10us. As an example, a 2us value
  means that the timer is set only every 5 interrupts.

timer_rounds  0-20, default 0
  How many extra timer runs before re-enabling interrupts. This allows
  reducing the number of MSI interrupts while keeping delay_us small.
  This is similar to the "napi_defer_hard_irqs" option in NAPI, but with
  some subtle differences (e.g. here the number of rounds is
  deterministic, and interrupts are disabled at MSI level).

Change-Id: I47c5059ad537fcb9561f924620cf68e1d648aae6
---
 arch/x86/kernel/cpu/common.c | 1 +
 drivers/irqchip/irq-gic-v3.c | 2 ++
 kernel/irq/handle.c          | 3 +++
 kernel/irq/irqdesc.c         | 1 +
 4 files changed, 7 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 02d97834a1d4d..1953419fde6ff 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2440,6 +2440,7 @@ void cpu_init(void)
 
 		intel_posted_msi_init();
 	}
+	irq_moderation_percpu_init();
 
 	mmgrab(&init_mm);
 	cur->active_mm = &init_mm;
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 3de351e66ee84..902bcbf9d85d8 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1226,6 +1226,8 @@ static void gic_cpu_sys_reg_init(void)
 		WARN_ON(gic_dist_security_disabled() != cpus_have_security_disabled);
 	}
 
+	irq_moderation_percpu_init();
+
 	/*
 	 * Some firmwares hand over to the kernel with the BPR changed from
 	 * its reset value (and with a value large enough to prevent
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index e103451243a0b..2cacceaaea9d0 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -12,6 +12,7 @@
 #include <linux/random.h>
 #include <linux/sched.h>
 #include <linux/interrupt.h>
+#include <linux/irq_moderation.h>
 #include <linux/kernel_stat.h>
 
 #include <asm/irq_regs.h>
@@ -254,9 +255,11 @@ irqreturn_t handle_irq_event(struct irq_desc *desc)
 	irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
 	raw_spin_unlock(&desc->lock);
 
+	irq_moderation_hook(desc); /* may disable irq so must run unlocked */
 	ret = handle_irq_event_percpu(desc);
 
 	raw_spin_lock(&desc->lock);
+	irq_moderation_epilogue(desc); /* start moderation timer if needed */
 	irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
 	return ret;
 }
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index db714d3014b5f..e3efbecf5b937 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -134,6 +134,7 @@ static void desc_set_defaults(unsigned int irq, struct irq_desc *desc, int node,
 	desc->tot_count = 0;
 	desc->name = NULL;
 	desc->owner = owner;
+	irq_moderation_init_fields(desc);
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(desc->kstat_irqs, cpu) = (struct irqstat) { };
 	desc_smp_init(desc, node, affinity);
-- 
2.51.2.1041.gc1ab5b90ca-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ