[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190830145501.zadfv2ffuu7j46ft@e107158-lin.cambridge.arm.com>
Date: Fri, 30 Aug 2019 15:55:02 +0100
From: Qais Yousef <qais.yousef@....com>
To: Valentin Schneider <valentin.schneider@....com>
Cc: Jing-Ting Wu <jing-ting.wu@...iatek.com>,
Peter Zijlstra <peterz@...radead.org>,
Matthias Brugger <matthias.bgg@...il.com>,
wsd_upstream@...iatek.com, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-mediatek@...ts.infradead.org
Subject: Re: [PATCH 1/1] sched/rt: avoid contend with CFS task
On 08/29/19 11:38, Valentin Schneider wrote:
> On 29/08/2019 04:15, Jing-Ting Wu wrote:
> > At original linux design, RT & CFS scheduler are independent.
> > Current RT task placement policy will select the first cpu in
> > lowest_mask, even if the first CPU is running a CFS task.
> > This may put RT task to a running cpu and let CFS task runnable.
> >
> > So we select idle cpu in lowest_mask first to avoid preempting
> > CFS task.
> >
>
> Regarding the RT & CFS thing, that's working as intended. RT is a whole
> class above CFS, it shouldn't have to worry about CFS.
>
> On the other side of things, CFS does worry about RT. We have the concept
> of RT-pressure in the CFS scheduler, where RT tasks will reduce a CPU's
> capacity (see fair.c::scale_rt_capacity()).
>
> CPU capacity is looked at on CFS wakeup (see wake_cap() and
> find_idlest_cpu()), and the periodic load balancer tries to spread load
> over capacity, so it'll tend to put less things on CPUs that are also
> running RT tasks.
>
> If RT were to start avoiding rqs with CFS tasks, we'd end up with a nasty
> situation were both are avoiding each other. It's even more striking when
> you see that RT pressure is done with a rq-wide RT util_avg, which
> *doesn't* get migrated when a RT task migrates. So if you decide to move
> a RT task to an idle CPU "B" because CPU "A" had runnable CFS tasks, the
> CFS scheduler will keep seeing CPU "B" as not significantly RT-pressured
> while that util_avg signal ramps up, whereas it would correctly see CPU
> "A" as RT-pressured if the RT task previously ran there.
>
> So overall I think this is the wrong approach.
I like the idea, but yeah tend to agree the current approach might not be
enough.
I think the major problem here is that on generic systems where CFS is a first
class citizen, RT tasks can be hostile to them - not always necessarily for a
good reason.
To further complicate the matter, even among CFS tasks we can't tell which are
more important than the others - though hopefully latency-nice proposal will
make the situation better.
So I agree we have a problem here, but I think this patch is just a temporary
band aid and we need to do better. Though I have no concrete suggestion yet on
how to do that.
Another thing I couldn't quantify yet how common and how severe this problem is
yet. Jing-Ting, if you can share the details of your use case that'd be great.
Cheers
--
Qais Yousef
Powered by blists - more mailing lists