Message-ID: <CALOAHbBQXSuUmz8C2CKA2o-Menup8uz3qOX34JsZCCG68GhaWg@mail.gmail.com>
Date: Wed, 28 May 2025 10:10:51 +0800
From: Yafang Shao <laoar.shao@...il.com>
To: Michal Koutný <mkoutny@...e.com>
Cc: mingo@...hat.com, peterz@...radead.org, hannes@...xchg.org, 
	juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com, 
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com, 
	surenb@...gle.com, linux-kernel@...r.kernel.org, cgroups@...r.kernel.org, 
	lkp@...el.com
Subject: Re: [PATCH v9 1/2] sched: Fix cgroup irq time for CONFIG_IRQ_TIME_ACCOUNTING

On Tue, May 27, 2025 at 11:33 PM Michal Koutný <mkoutny@...e.com> wrote:
>
> Hello.
>
> On Sun, May 11, 2025 at 11:07:59AM +0800, Yafang Shao <laoar.shao@...il.com> wrote:
> > The CPU usage of the cgroup is relatively low at around 55%, but this usage
> > doesn't increase, even with more netperf tasks. The reason is that CPU0 is
> > at 100% utilization, as confirmed by mpstat:
> >
> >   02:56:22 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> >   02:56:23 PM    0    0.99    0.00   55.45    0.00    0.99   42.57    0.00    0.00    0.00    0.00
> >
> >   02:56:23 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> >   02:56:24 PM    0    2.00    0.00   55.00    0.00    0.00   43.00    0.00    0.00    0.00    0.00
> >
> > It is clear that the %soft is excluded in the cgroup of the interrupted
> > task. This behavior is unexpected. We should include IRQ time in the
> > cgroup to reflect the pressure the group is under.
>
> I think this would go against the intention of CONFIG_IRQ_TIME_ACCOUNTING
> (someone more familiar may chime in).

Please refer to the discussion with Ingo:
https://lore.kernel.org/all/aBsGXCKX8-2_Cn9x@gmail.com/

>
> > After a thorough analysis, I discovered that this change in behavior is due
> > to commit 305e6835e055 ("sched: Do not account irq time to current task"),
> > which altered whether IRQ time should be charged to the interrupted task.
> > While I agree that a task should not be penalized by random interrupts, the
> > task itself cannot progress while interrupted. Therefore, the interrupted
> > time should be reported to the user.
> >
> > The system metric in cpuacct.stat is crucial in indicating whether a
> > container is under heavy system pressure, including IRQ/softirq activity.
> > Hence, IRQ/softirq time should be included in the cpuacct system usage,
> > which also applies to cgroup2’s rstat.
>
> So I guess, it'd be better to add a separate entry in cpu.stat with
> irq_usec (instead of bundling it into system_usec in spite of
> CONFIG_IRQ_TIME_ACCOUNTING).
>
> I admit, I'd be happier if irq.pressure values could be used for
> that. Maybe not the PSI ratio itself but irq.pressure:total should be
> that amount. WDYT?

Thank you for your suggestion. Both methods can effectively retrieve
the container’s IRQ usage. However, I prefer adding a new entry
irq_usec to cpu.stat since it aligns better with CPU utilization
metrics.
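
To illustrate why a cpu.stat entry fits utilization monitoring, here is a
minimal sketch of how a userspace tool might consume it. It assumes the
proposed irq_usec key is exposed as a cumulative microsecond counter next to
the existing user_usec and system_usec keys; that key is only a proposal in
this thread, and the sample values below are made up for illustration.

```python
def parse_cpu_stat(text):
    """Parse 'key value' lines of a cgroup v2 cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def utilization(prev, curr, interval_usec):
    """Return per-field utilization, in percent of one CPU, between two samples.

    irq_usec is the entry proposed in this thread (an assumption here);
    a missing key is treated as 0 so the code also runs on kernels without it.
    """
    fields = ("user_usec", "system_usec", "irq_usec")
    return {
        f: 100.0 * (curr.get(f, 0) - prev.get(f, 0)) / interval_usec
        for f in fields
    }


# Hypothetical samples taken one second apart (values are illustrative only).
before = parse_cpu_stat(
    "usage_usec 1000000\nuser_usec 100000\nsystem_usec 900000\nirq_usec 400000\n"
)
after = parse_cpu_stat(
    "usage_usec 1570000\nuser_usec 110000\nsystem_usec 1460000\nirq_usec 830000\n"
)
util = utilization(before, after, 1_000_000)
for name, pct in util.items():
    print(f"{name}: {pct:.1f}%")
```

On a real system the text would come from the cgroup's cpu.stat file, sampled
twice; whether irq_usec ends up counted inside system_usec or alongside it is
exactly the open question above, so the sketch only reports the raw deltas.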


--
Regards
Yafang
