Message-ID: <20241101105426.GX14555@noisy.programming.kicks-ass.net>
Date: Fri, 1 Nov 2024 11:54:26 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Yafang Shao <laoar.shao@...il.com>
Cc: mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, vschneid@...hat.com, hannes@...xchg.org,
	surenb@...gle.com, cgroups@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/4] sched: Fix irq accounting for
 CONFIG_IRQ_TIME_ACCOUNTING

On Fri, Nov 01, 2024 at 11:17:46AM +0800, Yafang Shao wrote:
> After enabling CONFIG_IRQ_TIME_ACCOUNTING to track IRQ pressure in our
> container environment, we encountered several user-visible behavioral
> changes:
> 
> - Interrupted IRQ/softirq time is not accounted for in the cpuacct cgroup
> 
>   This breaks userspace applications that rely on CPU usage data from
>   cgroups to monitor CPU pressure. This patchset resolves the issue by
>   ensuring that IRQ/softirq time is accounted for in the cgroup of the
>   interrupted tasks.
> 
> - getrusage(2) does not include time interrupted by IRQ/softirq
> 
>   Some services use getrusage(2) to check if workloads are experiencing CPU
>   pressure. Since IRQ/softirq time is no longer charged to task runtime,
>   getrusage(2) can no longer reflect the CPU pressure caused by heavy
>   interrupts.
> 
> This patchset addresses the first issue, which is relatively
> straightforward. 

So I don't think it is. I think they're both the same issue. You cannot
know on whose behalf the work done by the (soft) interrupt is.

For instance, if you were to create 2 cgroups, and have one cgroup do a
while(1) loop, while you'd have that other cgroup do your netperf
workload, I suspect you'll see significant (soft)irq load on the
while(1) cgroup, even though it's guaranteed to not be from it.
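
As a rough illustration of the two-cgroup scenario, here is a toy
simulation, not kernel code. Everything in it is assumed for the sake
of the sketch: the cgroup names "spin" and "netperf", the 9:1 CPU
split, and the fixed softirq interval. It charges each softirq tick to
whichever cgroup happened to be on the CPU, and separately counts the
cgroup that actually caused it.

```python
from collections import Counter

def charge_irq_to_interrupted(timeline, irq_ticks, irq_source):
    """Toy model of charging (soft)irq time to the interrupted task.

    timeline   -- list with one cgroup name per tick: who is on the CPU
    irq_ticks  -- tick indices at which a softirq fires (one tick each)
    irq_source -- cgroup whose workload really triggered the softirqs
    """
    charged = Counter()  # what the accounting would report
    caused = Counter()   # who the work was actually for
    for tick in irq_ticks:
        charged[timeline[tick]] += 1
        caused[irq_source] += 1
    return charged, caused

# Hypothetical schedule: the while(1) cgroup holds the CPU 9 ticks out
# of 10, and the netperf cgroup gets the remaining tick.
timeline = (["spin"] * 9 + ["netperf"]) * 10   # 100 ticks in total
irq_ticks = range(0, 100, 3)                   # a softirq every 3rd tick

charged, caused = charge_irq_to_interrupted(timeline, irq_ticks, "netperf")
print("charged:", dict(charged))
print("caused: ", dict(caused))
```

In this made-up schedule most of the softirq ticks land on the "spin"
cgroup purely because it is on the CPU most of the time, even though
all of them were caused by "netperf".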

Same with rusage -- rusage is fully task centric, and the work done by
(soft) irqs is not necessarily related to the task they interrupt.
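
To make "task centric" concrete, this is a minimal sketch of the kind
of check a monitoring service might do, using Python's standard
resource module as a wrapper around getrusage(2). The helper name
task_cpu_seconds is made up for this example. The call only reports
user and system CPU time for the calling process; nothing in it says
on whose behalf an interrupt ran.

```python
import resource

def task_cpu_seconds():
    """Return (user, system) CPU seconds of the calling process.

    Both values come from getrusage(RUSAGE_SELF) and describe this
    process only.
    """
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_utime, ru.ru_stime

user_s, sys_s = task_cpu_seconds()
print(f"user={user_s:.3f}s system={sys_s:.3f}s")
```

Whether time spent in interrupts that hit this process shows up in
those numbers depends on the kernel configuration, which is the
behavioral change described in the quoted text.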


So while you're trying to make the world conform to your legacy
monitoring view, perhaps you should fix your view of things.
