Message-ID: <87bjl06yij.ffs@tglx>
Date: Tue, 18 Nov 2025 00:16:20 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Luigi Rizzo <lrizzo@...gle.com>, Marc Zyngier <maz@...nel.org>, Luigi
Rizzo <rizzo.unipi@...il.com>, Paolo Abeni <pabeni@...hat.com>, Andrew
Morton <akpm@...ux-foundation.org>, Sean Christopherson
<seanjc@...gle.com>, Jacob Pan <jacob.jun.pan@...ux.intel.com>
Cc: linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org, Bjorn Helgaas
<bhelgaas@...gle.com>, Willem de Bruijn <willemb@...gle.com>, Luigi Rizzo
<lrizzo@...gle.com>
Subject: Re: [PATCH v2 3/8] genirq: soft_moderation: implement fixed moderation
On Mon, Nov 17 2025 at 20:30, Thomas Gleixner wrote:
> On Sun, Nov 16 2025 at 18:28, Luigi Rizzo wrote:
>> +	ms->rounds_left--;
>> +
>> +	if (ms->rounds_left > 0) {
>> +		/* Timer still alive, just call the handlers. */
>> +		list_for_each_entry_safe(desc, next, &ms->descs, mod.ms_node) {
>> +			ms->irq_count += irq_mod_info.count_timer_calls;
I missed this gem before. How is this supposed to calculate an interrupt
rate when count_timer_calls is disabled?
Yet another magic knob to tweak something which works by chance and not
by design.
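To make the objection concrete, here is a minimal userspace sketch of the
accounting in the quoted hunk. Only the names irq_count, rounds_left and
count_timer_calls come from the patch; struct mod_state, simulate_polling()
and the descriptor count are hypothetical stand-ins, and the real code walks
a list of descriptors instead of a plain loop.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the patch's per-CPU state. Only the field
 * names irq_count and rounds_left are taken from the quoted hunk.
 */
struct mod_state {
	unsigned long irq_count;
	int rounds_left;
};

/*
 * Replay the timer path of the quoted hunk until the timer expires:
 * decrement rounds_left and, while the timer is still alive, add
 * count_timer_calls once per polled descriptor.
 */
static unsigned long simulate_polling(struct mod_state *ms, int nr_descs,
				      unsigned int count_timer_calls)
{
	assert(ms && nr_descs >= 0);

	while (ms->rounds_left > 0) {
		ms->rounds_left--;
		if (ms->rounds_left > 0) {
			/* Stands in for list_for_each_entry_safe() */
			for (int i = 0; i < nr_descs; i++)
				ms->irq_count += count_timer_calls;
		}
	}
	return ms->irq_count;
}
```

Under these assumptions, calling simulate_polling() with count_timer_calls
set to 0 leaves irq_count untouched no matter how many rounds are polled, so
no interrupt rate can be derived from it while in polling mode.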
TBH, this whole thing should be put into the 'ugly code museum' for
educational purposes and deterrence. It wants to be rewritten from
scratch with a proper design and a structured understandable approach.
This 'polish the Google PoC hackery to death' approach will go nowhere.
It's just a ginormous waste of time. Not that I care about the time you
waste with that, but I pretty much care about mine.
That said, start over from scratch and take the feedback into account so
you can address the substantial problems I pointed out (CPU hotplug,
concurrency, life time management, state consistency, affinity changes)
in the design and not after the fact.
First of all please find some other wording than moderation. That's just
a randomly diced word without real meaning. Pick something which
describes what this infrastructure actually does: Adaptive polling, no?
There are a couple of other fundamental questions to answer upfront:
1) Is this 'throttle everything on a CPU' the proper approach?
To me this does not make sense. The CPU hogging network adapter or
disk drive has no business delaying low frequency interrupts,
which might be important, just because.
Making this a per interrupt line property allows solving a few
other issues trivially like the integration into that posted MSI
muck.
It also reduces the number of descriptors to be polled in the
timer interrupt.
2) Shouldn't the interrupt source be masked at the device level once
an interrupt is switched into polling mode?
Masking it at the device level (without touching disabled state)
is definitely a sensible thing to do. It keeps state consistent
and again allows trivial integration of that posted MSI stuff
without insane hacks all over the place.
3) Does a pure interrupt rate based scheme make sense?
Definitely not in the way it's implemented. Why?
Simply because once you have switched to polling mode there is no real
information anymore as you fail to take the return value of the
handler into account. So unless your magic knob is 0 every polled
interrupt is accounted for whether it actually handles an
interrupt or not.
But if your magic knob is 0 then this purely relies on irqtime
accounting, which includes the timer interrupt as an accumulative
measure.
IOW, it "works" by some definition of works after adding enough magic
knobs to make it "work" under certain circumstances. "Works for
Google" is not a good argument.
That's unmaintainable and unusable. No amount of magic command
line examples will fix that because the state space of your knobs
is way too big to be useful and comprehensible.
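The point about the handler return value in 3) can be illustrated with a
small userspace sketch. It is an assumption-laden illustration, not a
proposed implementation: struct poll_stats, poll_once() and fake_handler()
are hypothetical names, and the enum merely mirrors the kernel's IRQ_NONE
and IRQ_HANDLED values so the code builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mirror of the kernel's irqreturn_t values used here. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1 };

typedef enum irqreturn (*poll_handler_t)(void *dev);

/* Hypothetical per-interrupt statistics kept while in polling mode. */
struct poll_stats {
	unsigned long polls;	/* every invocation from the timer */
	unsigned long handled;	/* invocations which found real work */
};

/*
 * Invoke the handler once from the polling path and account for it
 * according to its return value instead of counting it blindly.
 */
static void poll_once(struct poll_stats *st, poll_handler_t handler, void *dev)
{
	assert(st != NULL && handler != NULL);

	st->polls++;
	if (handler(dev) == IRQ_HANDLED)
		st->handled++;
}

/* Hypothetical device: an integer counting pending events. */
static enum irqreturn fake_handler(void *dev)
{
	int *pending = dev;

	if (*pending == 0)
		return IRQ_NONE;
	(*pending)--;
	return IRQ_HANDLED;
}
```

With two pending events and five polls, only the handled counter reflects
actual device activity. A rate computed from it drops as soon as the device
goes idle, which a blind per-poll count cannot show.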
Add all the questions which pop up when you really sit down and do a
proper from-scratch design of this.
Thanks,
tglx