Message-ID: <c975578.a64f.1932a950632.Coremail.00107082@163.com>
Date: Thu, 14 Nov 2024 20:10:29 +0800 (CST)
From: "David Wang" <00107082@....com>
To: "Thomas Gleixner" <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH 01/13] kernel/irq/proc: use seq_put_decimal_ull_width()
for decimal values
Hi,
At 2024-11-14 03:10:08, "Thomas Gleixner" <tglx@...utronix.de> wrote:
>On Sat, Nov 09 2024 at 00:07, David Wang wrote:
>> The improvement has pratical significance, considering many monitoring
>> tools would read /proc/interrupts periodically.
>
>I've applied this, but ...
>
>looking at a 256 CPU machine. /proc/interrupts provides data for 560
>interrupts, which amounts to ~1.6MB data size.
>
>There are 560 * 256 = 143360 interrupt count fields. 140615 of these
>fields are zero, which means 140615 * 11 bytes. That's 96% of the
>overall data size. The actually useful information is less than
>50KB if properly condensed.
>
>I'm really amused that people spend a lot of time to improve the
>performance of /proc/interrupts instead of actually sitting down and
>implementing a proper new interface for this purpose, which would make
>both the kernel and the tools faster by probably several orders of
>magnitude.
That's a great idea~.
I tried making the change and measuring the performance. The result is good:
the kernel-side improvement is not that big, but it is still significant.
Here is what I did (draft code is at the end):
I created two new /proc entries for comparison:
1. /proc/interruptsp: non-zero values only, arch-independent irqs, without descriptions
$ cat /proc/interruptsp
IRQ CPU counter # positive only
0 0 40
38 5 23
39 0 81
40 1 6
41 2 111
...
$ cat /proc/interruptsp | wc
18 57 181
$ strace -e read -T cat /proc/interruptsp > /dev/null
...
read(3, "IRQ CPU counter # positive only\n"..., 131072) = 181 <0.000144>
read(3, "", 131072) = 0 <0.000009>
$ time ./interruptsp # 1 million rounds of open/read(all)/close;
real 1m54.727s
user 0m0.368s
sys 1m54.309s
2. /proc/interruptso: same as the old format, except that arch-dependent irqs are removed
$ cat /proc/interruptso | wc
32 388 4439
$ strace -e read -T cat /proc/interruptso > /dev/null
...
read(3, " CPU0 CPU1 "..., 131072) = 4005 <0.000071>
read(3, " 88: 0 0 "..., 131072) = 434 <0.000111>
read(3, "", 131072) = 0 <0.000009>
$ time ./interruptso # 1 million rounds of open/read(all)/close;
real 2m19.284s
user 0m0.400s
sys 2m18.756s
The output is indeed tens of times smaller, which would be a huge improvement
for applications that parse the whole content. On the kernel side, however,
strace and the stress test indicate that the improvement is not that huge,
but still significant, ~40%.
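The ./interruptsp and ./interruptso test programs are not included in this mail. A minimal sketch of such an open/read(all)/close loop could look like the following; the function name read_file_rounds() is hypothetical, and the 131072-byte buffer is only chosen to match the read size seen in the strace output above.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open `path`, read it to EOF and close it again, `rounds` times.
 * Returns the total number of bytes read, or -1 on the first error.
 * (Hypothetical helper, not the program used for the timings above.)
 */
static long long read_file_rounds(const char *path, long rounds)
{
	static char buf[131072];	/* same size as cat's reads in strace */
	long long total = 0;

	for (long r = 0; r < rounds; r++) {
		int fd = open(path, O_RDONLY);
		ssize_t n;

		if (fd < 0)
			return -1;
		/* each read() drives the seq_file show() callbacks in the kernel */
		while ((n = read(fd, buf, sizeof(buf))) > 0)
			total += n;
		close(fd);
		if (n < 0)
			return -1;
	}
	return total;
}
```

Calling read_file_rounds("/proc/interruptsp", 1000000) from main() and running the binary under time(1) would roughly correspond to the measurement above; nearly all of the cost should then show up as sys time.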
Based on simple profiling, the bottleneck seems to be mtree_load(), called by
irq_to_desc() (not sure whether this is expected or not):
show_interruptsp(74.724% 845541/1131554)
    mtree_load(56.596% 478544/845541)
    __rcu_read_unlock(5.914% 50004/845541)
    __rcu_read_lock(3.056% 25840/845541)
    irq_to_desc(2.151% 18184/845541)
    seq_put_decimal_ull_width(1.211% 10243/845541)
    ...
I think the improvement is worth pursuing. Maybe a new interface for "active"
interrupts, say /proc/activeinterrupts? The old /proc/interrupts could then
serve as a table of the available ids/cpus/descriptions.
Do you plan to work on this? If not, I can spend time on it.
Draft code:
int show_interruptsp(struct seq_file *p, void *v)
{
	int i = *(loff_t *) v, j;
	struct irq_desc *desc;

	if (i >= ACTUAL_NR_IRQS)
		return 0;

	/* print the header once, before the first irq */
	if (i == 0)
		seq_puts(p, "IRQ CPU counter # positive only\n");

	rcu_read_lock();
	desc = irq_to_desc(i);
	if (!desc || irq_settings_is_hidden(desc))
		goto outsparse;

	if (!desc->action || irq_desc_is_chained(desc) || !desc->kstat_irqs)
		goto outsparse;

	/* emit one "irq cpu count" line per non-zero per-CPU counter */
	for_each_online_cpu(j) {
		/* kstat_irqs was already checked for NULL above */
		unsigned int cnt = per_cpu(desc->kstat_irqs->cnt, j);

		if (cnt > 0) {
			seq_put_decimal_ull(p, "", i);
			seq_put_decimal_ull(p, " ", j);
			seq_put_decimal_ull(p, " ", cnt);
			seq_putc(p, '\n');
		}
	}
outsparse:
	rcu_read_unlock();
	return 0;
}
int show_interruptso(struct seq_file *p, void *v)
{
	static int prec;
	int i = *(loff_t *) v, j;
	struct irqaction *action;
	struct irq_desc *desc;
	unsigned long flags;

	/* return when arch-independent irqs are done */
	if (i >= ACTUAL_NR_IRQS)
		return 0;
	...
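For the tool side, here is a rough sketch of how a monitoring program might consume the draft "IRQ CPU counter" format shown earlier. It assumes one "<irq> <cpu> <count>" triple per line plus a single header line; struct irq_sample, parse_irq_line() and read_irq_samples() are hypothetical names for illustration, and the format itself is only a draft.

```c
#include <stdio.h>

/* One non-zero per-CPU counter as printed by the draft interface. */
struct irq_sample {
	unsigned int irq;
	unsigned int cpu;
	unsigned long long count;
};

/*
 * Parse one line of the form "<irq> <cpu> <count>".
 * Returns 1 on success and 0 for anything else, such as the header line.
 */
static int parse_irq_line(const char *line, struct irq_sample *out)
{
	return sscanf(line, "%u %u %llu",
		      &out->irq, &out->cpu, &out->count) == 3;
}

/* Read up to `max` samples from `fp`, skipping lines that do not parse. */
static size_t read_irq_samples(FILE *fp, struct irq_sample *out, size_t max)
{
	char line[128];
	size_t n = 0;

	while (n < max && fgets(line, sizeof(line), fp)) {
		if (parse_irq_line(line, &out[n]))
			n++;
	}
	return n;
}
```

Since only non-zero counters are listed, a tool would treat any (irq, cpu) pair missing from the output as zero.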
Thanks~
David