[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201021152637.GH32041@suse.de>
Date: Wed, 21 Oct 2020 16:26:37 +0100
From: Mel Gorman <mgorman@...e.de>
To: Julia Lawall <julia.lawall@...ia.fr>
Cc: Vincent Guittot <vincent.guittot@...aro.org>,
Ingo Molnar <mingo@...hat.com>,
kernel-janitors@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
linux-kernel@...r.kernel.org,
Valentin Schneider <valentin.schneider@....com>,
Gilles.Muller@...ia.fr
Subject: Re: [PATCH] sched/fair: check for idle core
On Wed, Oct 21, 2020 at 03:48:59PM +0200, Julia Lawall wrote:
> > I worry it's overkill because prev is always used if it is idle even
> > if it is on a node remote to the waker. It cuts off the option of a
> > wakee moving to a CPU local to the waker which is not equivalent to the
> > original behaviour.
>
> Could it be possible to check p->recent_used_cpu? If that is prev (or on
> the same socket?), then prev could be a good choice. If that is on the
> same socket as the waker, then maybe the waker would be better.
>
It is an interesting idea but the treatment of p->recent_used_cpu is a
bit strange though. At one point, I looked at reconciling all the select
CPU paths into one single pass (prototype worked back in 5.6-era but was
not a universal win). As part of that, I remembered that the setting
of recent_used_cpu is a problem because it can be set to the wakers
recent_used_cpu by this chunk
if (want_affine)
current->recent_used_cpu = cpu;
Fixing that on its own causes other problems. From an unposted series
that "fixed it" as the last part of 14 patch series;
After select_idle_sibling, the *wakers* recent_used_cpu is set to the
CPU used for the wakee. This was necessary at the time as without it,
the miss rate was unacceptably high. It still works, but it is less
efficient than it can be.
This patch leaves the waker state alone and uses either the previous CPU on
a hit or the target CPU on a miss when the recently used CPU is examined
for idleness. The rest of the series is important as without it, this
patch improves the number of hits but the miss rate gets ridiculously high.
After the rest of the series, the hit and miss rates are both higher but
the miss rate is acceptable.
As the hunk means that recent_used_cpu could have been set based
on a previous wakeup, it makes it unreliable for making cross-node
decisions. p->recent_used_cpu's primary purpose at the moment is to
avoid select_idle_sibling searches.
--
Mel Gorman
SUSE Labs
Powered by blists - more mailing lists