Message-ID: <CALOAHbAXyet_ip62suwgz3GdkfFQo29n6UoDav+XON4zkTUfMw@mail.gmail.com>
Date: Fri, 22 Nov 2024 11:17:56 +0800
From: Yafang Shao <laoar.shao@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Michal Koutný <mkoutny@...e.com>, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
hannes@...xchg.org, surenb@...gle.com, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 0/4] sched: Fix missing irq time when
CONFIG_IRQ_TIME_ACCOUNTING is enabled

On Mon, Nov 18, 2024 at 6:10 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Sun, Nov 17, 2024 at 10:56:21AM +0800, Yafang Shao wrote:
> > On Fri, Nov 15, 2024 at 9:41 PM Michal Koutný <mkoutny@...e.com> wrote:
>
> > > > The load balancer is malfunctioning due to the exclusion of IRQ time from
> > > > CPU utilization calculations.
> > >
> > > Could this be fixed by subtracting (global) IRQ time from (presumed
> > > total) system capacity that the balancer uses for its decisions? (i.e.
> > > without exact per-cgroup breakdown of IRQ time)
> >
> > The issue here is that the global IRQ time may include the interrupted
> > time of tasks outside the target cgroup. As a result, I don't believe
> > it's possible to find a reliable solution without modifying the
> > kernel.
>
> Since there is no relation between the interrupt and the interrupted
> task (and through that its cgroup) -- all time might or might not be
> part of your cgroup of interest. Consider it a random distribution if
> you will.

Some points require further clarification.

On our servers, the majority of IRQ/softIRQ activity originates from
network traffic, and we consistently enable Receive Flow Steering
(RFS) [0]. This configuration ensures that softIRQs are more likely to
interrupt the tasks responsible for processing the corresponding
packets. As a result, the distribution of softIRQs is not random but
instead closely aligned with the packet-handling tasks.

[0]. https://lwn.net/Articles/381955/
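
For anyone who wants to check whether RFS is active on a given host, here
is a minimal sketch. It assumes the two knobs described in the kernel's
networking scaling documentation: the global rps_sock_flow_entries sysctl
and the per-RX-queue rps_flow_cnt file, both of which must be non-zero.
The helper names rfs_status() and rfs_enabled() and the "root" parameter
are hypothetical, introduced only for this illustration. It is not part of
the patch series.

```python
import glob
import os


def rfs_status(root="/"):
    """Read the RFS knobs found under `root`.

    Returns (global_entries, per_queue), where per_queue maps each
    rps_flow_cnt path (relative to `root`) to its integer value.
    """
    # Global flow table size; 0 or a missing file means RFS is off.
    global_path = os.path.join(root, "proc/sys/net/core/rps_sock_flow_entries")
    global_entries = 0
    if os.path.exists(global_path):
        with open(global_path) as f:
            global_entries = int(f.read().strip())

    # Per-receive-queue flow counts for every network device.
    pattern = os.path.join(root, "sys/class/net/*/queues/rx-*/rps_flow_cnt")
    per_queue = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            per_queue[os.path.relpath(path, root)] = int(f.read().strip())
    return global_entries, per_queue


def rfs_enabled(root="/"):
    """True if the global table and at least one RX queue are non-zero."""
    global_entries, per_queue = rfs_status(root)
    return global_entries > 0 and any(v > 0 for v in per_queue.values())
```

Calling rfs_enabled() with the default root inspects the running system.
Passing a different root lets the same check run against a copied or
mocked /proc and /sys tree.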