Message-ID: <CAGETcx_nVKYMhCmC6BPNVxLfDaz=uoSsk1WOs-aX=M03Ew2qTA@mail.gmail.com>
Date: Wed, 13 Nov 2024 22:36:31 -0800
From: Saravana Kannan <saravanak@...gle.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Ingo Molnar <mingo@...hat.com>, "Peter Zijlstra (Intel)" <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Benjamin Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>, LKML <linux-kernel@...r.kernel.org>,
wuyun.abel@...edance.com, youssefesmat@...omium.org,
Thomas Gleixner <tglx@...utronix.de>, efault@....de,
K Prateek Nayak <kprateek.nayak@....com>, John Stultz <jstultz@...gle.com>,
Vincent Palomares <paillon@...gle.com>
Subject: Re: Very high scheduling delay with plenty of idle CPUs
Ugh... just realized that for a few of the emails I've been replying
directly to one person instead of reply-all.
On Fri, Nov 8, 2024 at 1:02 AM Vincent Guittot
<vincent.guittot@...aro.org> wrote:
>
> On Fri, 8 Nov 2024 at 08:28, Saravana Kannan <saravanak@...gle.com> wrote:
> >
> > Hi scheduler folks,
> >
> > I'm running into some weird scheduling issues when testing non-sched
> > changes on a Pixel 6 that's running close to 6.12-rc5. I'm not sure if
> > this is an issue in earlier kernel versions or not.
> >
> > The async suspend/resume code calls async_schedule_dev_nocall() to
> > queue up a bunch of work. These queued-up work items seem to be running in
> > kworker threads.
> >
> > However, there have been many times where I see scheduling latency
> > (runnable, but not running) of 4.5 ms or higher for a kworker thread
> > when there are plenty of idle CPUs.
>
> You are using EAS, aren't you?
> So the energy impact drives the CPU selection, not CPU idleness.
>
> There is a proposal to change feec to also take such cases into account,
> in addition to the energy impact:
> https://lore.kernel.org/lkml/64ed0fb8-12ea-4452-9ec2-7ad127b65529@arm.com/T/
>
> I still have to finalize v2
Anyway, I tried this series (got it from
https://git.linaro.org/people/vincent.guittot/kernel.git/log/?h=sched/rework-eas)
and:
1. The timing hasn't improved at all compared to not having the series.
2. There's still a lot of preemption of runnable tasks with some empty CPUs.
For example:
https://ui.perfetto.dev/#!/?s=955ff7e73edf32dab27501025211fa2ce322f725
Thanks,
Saravana
>
> >
> > Does async_schedule_dev_nocall() have some weird limitations on where
> > the work can be run? I know it has some NUMA related stuff, but the Pixel
> > 6 doesn't have NUMA. This oddity unnecessarily increases
> > suspend/resume latency as it adds up across kworker threads. So, I'd
> > appreciate any insights on what might be happening.
> >
> > If you know how to use perfetto (it's really pretty simple, all you
> > need to know is WASD and clicking), here's an example:
> > https://ui.perfetto.dev/#!/?s=e20045736e7dfa1e897db6489710061d2495be92
> >
> > Thanks,
> > Saravana