linux-kernel - Re: [PATCH 1/1] sched/rt: avoid contend with CFS task


Message-ID: <20190830145501.zadfv2ffuu7j46ft@e107158-lin.cambridge.arm.com>
Date:   Fri, 30 Aug 2019 15:55:02 +0100
From:   Qais Yousef <qais.yousef@....com>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     Jing-Ting Wu <jing-ting.wu@...iatek.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Matthias Brugger <matthias.bgg@...il.com>,
        wsd_upstream@...iatek.com, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org,
        linux-mediatek@...ts.infradead.org
Subject: Re: [PATCH 1/1] sched/rt: avoid contend with CFS task

On 08/29/19 11:38, Valentin Schneider wrote:
> On 29/08/2019 04:15, Jing-Ting Wu wrote:
> > In the original Linux design, the RT and CFS schedulers are independent.
> > The current RT task placement policy selects the first CPU in
> > lowest_mask, even if that CPU is running a CFS task.
> > This may place the RT task on a busy CPU and leave the CFS task
> > runnable but not running.
> > 
> > So select an idle CPU in lowest_mask first to avoid preempting
> > a CFS task.
> > 
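
As a rough illustration of the two placement policies described in the patch
text above, here is a minimal C sketch. struct cpu_state and both pick_*
functions are hypothetical names invented for this example; the real code
works on cpumasks in the RT scheduler's find_lowest_rq() path and is more
involved, so treat this as a model of the idea only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU snapshot for illustration; not a kernel structure. */
struct cpu_state {
    bool in_lowest_mask; /* CPU is a candidate according to RT priority */
    bool idle;           /* CPU is currently running nothing */
};

/* Behaviour described as current: the first candidate CPU wins,
 * whether or not it is busy running a CFS task. */
static int pick_first(const struct cpu_state *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (cpus[i].in_lowest_mask)
            return (int)i;
    return -1; /* no candidate at all */
}

/* Behaviour proposed by the patch: prefer an idle candidate,
 * and fall back to the first candidate if none is idle. */
static int pick_idle_first(const struct cpu_state *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (cpus[i].in_lowest_mask && cpus[i].idle)
            return (int)i;
    return pick_first(cpus, n);
}
```

With candidates { busy CPU 0, idle CPU 2 }, pick_first() returns 0 and
preempts whatever CFS task runs there, while pick_idle_first() returns 2.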
> 
> Regarding the RT & CFS thing, that's working as intended. RT is a whole
> class above CFS, it shouldn't have to worry about CFS.
> 
> On the other side of things, CFS does worry about RT. We have the concept
> of RT-pressure in the CFS scheduler, where RT tasks will reduce a CPU's
> capacity (see fair.c::scale_rt_capacity()).
> 
> CPU capacity is looked at on CFS wakeup (see wake_cap() and
> find_idlest_cpu()), and the periodic load balancer tries to spread load
> over capacity, so it'll tend to put fewer things on CPUs that are also
> running RT tasks.
> 
> If RT were to start avoiding rqs with CFS tasks, we'd end up with a nasty
> situation where both are avoiding each other. It's even more striking when
> you see that RT pressure is done with a rq-wide RT util_avg, which
> *doesn't* get migrated when a RT task migrates. So if you decide to move
> a RT task to an idle CPU "B" because CPU "A" had runnable CFS tasks, the
> CFS scheduler will keep seeing CPU "B" as not significantly RT-pressured
> while that util_avg signal ramps up, whereas it would correctly see CPU
> "A" as RT-pressured if the RT task previously ran there.
> 
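
The util_avg point above can be modelled with a small C sketch. This is a
simplified model, not the kernel implementation: it assumes a PELT-style
geometric average with a 32-period half-life on a 0..1024 scale, and it
reduces the capacity seen by CFS to "max minus RT utilization", ignoring the
other terms the real scale_rt_capacity() accounts for. The names pelt_run()
and cfs_capacity() are made up for this example.

```c
#define CAPACITY_MAX 1024.0
/* Per-period decay factor y, chosen so that y^32 == 0.5 (assumed model). */
#define PELT_Y 0.97857206

/* Advance a rq-wide RT util_avg by `periods` periods (about 1 ms each).
 * While an RT task runs, the signal ramps towards CAPACITY_MAX;
 * otherwise it decays towards zero. */
static double pelt_run(double util, int periods, int rt_running)
{
    double contrib = rt_running ? (1.0 - PELT_Y) * CAPACITY_MAX : 0.0;

    for (int i = 0; i < periods; i++)
        util = util * PELT_Y + contrib;
    return util;
}

/* Simplified view of the capacity left for CFS on a CPU. */
static double cfs_capacity(double rt_util)
{
    double free_cap = CAPACITY_MAX - rt_util;

    return free_cap < 1.0 ? 1.0 : free_cap;
}
```

In this model, if an RT task ran on CPU "A" for 64 periods and then moved to
an idle CPU "B", one period later "A" still looks heavily RT-pressured while
"B" looks almost untouched, even though the RT task is now running on "B".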
> So overall I think this is the wrong approach.

I like the idea, but yeah, I tend to agree that the current approach might
not be enough.

I think the major problem here is that on generic systems where CFS is a first
class citizen, RT tasks can be hostile to them - not always necessarily for a
good reason.

To further complicate the matter, even among CFS tasks we can't tell which are
more important than the others - though hopefully the latency-nice proposal
will make the situation better.

So I agree we have a problem here, but I think this patch is just a temporary
band-aid and we need to do better, though I have no concrete suggestion yet on
how to do that.

Another thing: I haven't yet been able to quantify how common and how severe
this problem is. Jing-Ting, if you can share the details of your use case
that'd be great.

Cheers

--
Qais Yousef