Date:   Wed, 30 May 2018 12:01:17 +0100
From:   Patrick Bellasi <patrick.bellasi@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <Morten.Rasmussen@....com>,
        viresh kumar <viresh.kumar@...aro.org>,
        Valentin Schneider <valentin.schneider@....com>,
        Quentin Perret <quentin.perret@....com>
Subject: Re: [PATCH v5 02/10] sched/rt: add rt_rq utilization tracking

On 30-May 12:06, Vincent Guittot wrote:
> On 30 May 2018 at 11:32, Patrick Bellasi <patrick.bellasi@....com> wrote:
> > On 29-May 15:29, Vincent Guittot wrote:

[...]

> >> >> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> >> >> index ef3c4e6..b4148a9 100644
> >> >> --- a/kernel/sched/rt.c
> >> >> +++ b/kernel/sched/rt.c
> >> >> @@ -5,6 +5,8 @@
> >> >>   */
> >> >>  #include "sched.h"
> >> >>
> >> >> +#include "pelt.h"
> >> >> +
> >> >>  int sched_rr_timeslice = RR_TIMESLICE;
> >> >>  int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
> >> >>
> >> >> @@ -1572,6 +1574,9 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> >> >>
> >> >>       rt_queue_push_tasks(rq);
> >> >>
> >> >> +     update_rt_rq_load_avg(rq_clock_task(rq), rq,
> >> >> +             rq->curr->sched_class == &rt_sched_class);
> >> >> +
> >> >>       return p;
> >> >>  }
> >> >>
> >> >> @@ -1579,6 +1584,8 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
> >> >>  {
> >> >>       update_curr_rt(rq);
> >> >>
> >> >> +     update_rt_rq_load_avg(rq_clock_task(rq), rq, 1);
> >> >> +
> >> >>       /*
> >> >>        * The previous task needs to be made eligible for pushing
> >> >>        * if it is still active
> >> >> @@ -2308,6 +2315,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued)
> >> >>       struct sched_rt_entity *rt_se = &p->rt;
> >> >>
> >> >>       update_curr_rt(rq);
> >> >> +     update_rt_rq_load_avg(rq_clock_task(rq), rq, 1);
> >> >
> >> > Mmm... not entirely sure... can't we fold
> >> >    update_rt_rq_load_avg() into update_curr_rt()?
> >> >
> >> > Currently update_curr_rt() is used in:
> >> >    dequeue_task_rt
> >> >    pick_next_task_rt
> >> >    put_prev_task_rt
> >> >    task_tick_rt
> >> >
> >> > while we call update_rt_rq_load_avg() only in:
> >> >    pick_next_task_rt
> >> >    put_prev_task_rt
> >> >    task_tick_rt
> >> > and
> >> >    update_blocked_averages
> >> >
> >> > Why don't we need to update at dequeue_task_rt() time?
> >>
> >> We are tracking the rt rq and not sched entities, so we only need
> >> to know when the rt sched class is, or stops being, the running
> >> class on the rq. Tracking dequeue_task_rt is useless.
> >
> > What about (push) migrations?
> 
> It doesn't make any difference. put_prev_task_rt() tells us that the
> task which was just running was an rt task, so we can account the
> past time as rt running time, and pick_next_task_rt() tells us that
> the next one will be an rt task, so we account the elapsed time
> either as rt or as non-rt time accordingly.

Right, I was missing that you are tracking RT (and DL) utilization
only at the RQ level, not at the SE level, thus we will not see
blocked utilization migrate.

> I can probably optimize pick_next_task_rt() by doing the below
> instead:
> 
> if (rq->curr->sched_class != &rt_sched_class)
>        update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);
> 
> If the prev task is an rt task, put_prev_task_rt() has already done
> the update.

Right.
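
So the tail of pick_next_task_rt() would end up looking roughly like
the below (just a sketch against the v5 code, assuming the same
update_rt_rq_load_avg() signature):

	rt_queue_push_tasks(rq);

	/*
	 * If the prev task was rt, put_prev_task_rt() has already
	 * updated the utilization; here we only need to account the
	 * elapsed time as non-rt time.
	 */
	if (rq->curr->sched_class != &rt_sched_class)
		update_rt_rq_load_avg(rq_clock_task(rq), rq, 0);

	return p;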

Just one more question about not tracking SEs. With the current
solution, once we migrate an RT task we have to wait for its blocked
PELT utilization to decay completely before that task's contribution
is ignored, which means that:
 1. we will see a higher utilization on the original CPU
 2. we don't immediately see the increased utilization on the
    destination CPU
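
To put rough numbers on this: PELT decays by y per 1024us period with
y^32 = 0.5, so the stale contribution only halves every ~32ms. A toy
calculation (plain userspace C, not kernel code; the util_avg value
at migration time is made up):

	/*
	 * How long a migrated task's blocked PELT contribution lingers
	 * on the source CPU: it halves every 32 periods of 1024us.
	 */
	#include <math.h>
	#include <stdio.h>

	int main(void)
	{
		const double y = pow(0.5, 1.0 / 32);	/* y^32 = 0.5 */
		double util = 400.0;	/* hypothetical util_avg at migration */
		int ms;

		for (ms = 0; ms <= 128; ms += 32) {
			printf("t=%3dms  leftover util ~%.0f\n", ms, util);
			util *= pow(y, 32);	/* decay over the next ~32ms */
		}
		return 0;
	}

So for roughly 100ms after the migration the source CPU still reports
a significant fraction of the task's utilization, while the
destination CPU reports none of it.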

I remember Juri had some patches to track SE utilization, which would
fix the two issues above. Can you remind me why we decided to go for
the RQ-level tracking solution only?
Shouldn't we expect some strange behavior on real systems when RT
tasks are moved around?

Perhaps we should run some tests on Android...

-- 
#include <best/regards.h>

Patrick Bellasi
