[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<SN6PR02MB4157A4B5C6726CDC369E0E39D4FA2@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Thu, 6 Jun 2024 14:34:00 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Thomas Gleixner <tglx@...utronix.de>, "kys@...rosoft.com"
<kys@...rosoft.com>, "haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>, "decui@...rosoft.com"
<decui@...rosoft.com>, "mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de"
<bp@...en8.de>, "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
"lpieralisi@...nel.org" <lpieralisi@...nel.org>, "kw@...ux.com"
<kw@...ux.com>, "robh@...nel.org" <robh@...nel.org>, "bhelgaas@...gle.com"
<bhelgaas@...gle.com>, "James.Bottomley@...senPartnership.com"
<James.Bottomley@...senPartnership.com>, "martin.petersen@...cle.com"
<martin.petersen@...cle.com>, "arnd@...db.de" <arnd@...db.de>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
CC: "maz@...nel.org" <maz@...nel.org>, "den@...inux.co.jp"
<den@...inux.co.jp>, "jgowans@...zon.com" <jgowans@...zon.com>,
"dawei.li@...ngroup.cn" <dawei.li@...ngroup.cn>
Subject: RE: [RFC 06/12] genirq: Add per-cpu flow handler with conditional IRQ
stats
From: Thomas Gleixner <tglx@...utronix.de> Sent: Thursday, June 6, 2024 2:34 AM
>
> On Thu, Jun 06 2024 at 03:14, Michael Kelley wrote:
> > From: Thomas Gleixner <tglx@...utronix.de> Sent: Wednesday, June 5, 2024 7:20 AM
> >>
> >> On Wed, Jun 05 2024 at 13:45, Michael Kelley wrote:
> >> > From: Thomas Gleixner <tglx@...utronix.de> Sent: Wednesday, June 5, 2024 6:20 AM
> >> >
> >> > In /proc/interrupts, the double-counting isn't a problem, and is
> >> > potentially helpful as you say. But /proc/stat, for example, shows a total
> >> > interrupt count, which will be roughly double what it was before. That
> >> > /proc/stat value then shows up in user space in vmstat, for example.
> >> > That's what I was concerned about, though it's not a huge problem in
> >> > the grand scheme of things.
> >>
> >> That's trivial to solve. We can mark interrupts to be excluded from
> >> /proc/stat accounting.
> >>
> >
> > OK. On x86, some simple #ifdef'ery in arch_irq_stat_cpu() can filter
> > out the HYP interrupts. But what do you envision on arm64, where
> > there is no arch_irq_stat_cpu()? On arm64, the top-level interrupt is a
> > normal Linux IRQ, and its count is included in the "kstat.irqs_sum" field
> > with no breakout by IRQ. Identifying the right IRQ and subtracting it
> > out later looks a lot uglier than the conditional stats accounting.
>
> Sure. There are two ways to solve that:
>
> 1) Introduce a IRQ_NO_PER_CPU_STATS flag, mark the interrupt
> accordingly and make the stats increment conditional on it.
> The downside is that the conditional affects every interrupt.
>
> 2) Do something like this:
>
> static inline
> void __handle_percpu_irq(struct irq_desc *desc, irqreturn_t (*handle)(struct irq_desc
> *))
> {
> struct irq_chip *chip = irq_desc_get_chip(desc);
>
> if (chip->irq_ack)
> chip->irq_ack(&desc->irq_data);
>
> handle(desc);
>
> if (chip->irq_eoi)
> chip->irq_eoi(&desc->irq_data);
> }
>
> void handle_percpu_irq(struct irq_desc *desc)
> {
> /*
> * PER CPU interrupts are not serialized. Do not touch
> * desc->tot_count.
> */
> __kstat_incr_irqs_this_cpu(desc);
> __handle_percpu_irq(desc, handle_irq_event_percpu);
> }
>
> void handle_percpu_irq_nostat(struct irq_desc *desc)
> {
> __this_cpu_inc(desc->kstat_irqs->cnt);
> __handle_percpu_irq(desc, __handle_irq_event_percpu);
> }
>
> So that keeps the interrupt accounted for in /proc/interrupts. If you
> don't want that remove the __this_cpu_inc() and mark the interrupt with
> irq_set_status_flags(irq, IRQ_HIDDEN). That will exclude it from
> /proc/interrupts too.
>
Yes, this works for not double-counting in the first place. Account for the
control message interrupts in their own Linux IRQ. Then for the top-level
interrupt, instead of adding a new handler with conditional accounting,
add a new per-CPU handler that does no accounting. I had not noticed
the IRQ_HIDDEN flag, and that solves my concern about having an
entry in /proc/interrupts that always shows zero interrupts. And with
no double-counting, the interrupt counts in /proc/stat won't be bloated.
On x86, I'll have to separately make the "HYP" line in /proc/interrupts
go away, but that's easy.
Thanks,
Michael
Powered by blists - more mailing lists