Message-ID: <20200203145752.sqxpqse6pav4nxxv@e107158-lin>
Date:   Mon, 3 Feb 2020 14:57:53 +0000
From:   Qais Yousef <qais.yousef@....com>
To:     Pavan Kondeti <pkondeti@...eaurora.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] sched: rt: Make RT capacity aware

On 02/03/20 11:02, Pavan Kondeti wrote:
> The full trace is attached. I'm copy/pasting the snippet that shows the
> packing happening. The custom trace_printk calls are added in cpupri_find()
> before calling fitness_fn(). As you can see, pid=535 is woken on CPU#7,
> where the RT task pid=538 is running. Right after waking, a push is
> attempted, but it doesn't work either.
> 
> This is not a serious problem for us, since we don't set uclamp.min=1024 on
> RT tasks. However, it changes the behavior and might introduce latency for
> RT tasks on b.L platforms running the upstream kernel as-is.
> 
>         big-task-538   [007] d.h.   403.401633: irq_handler_entry: irq=3 name=arch_timer
>         big-task-538   [007] d.h2   403.401633: sched_waking: comm=big-task pid=535 prio=89 target_cpu=007
>         big-task-538   [007] d.h2   403.401635: cpupri_find: before task=big-task-535 lowest_mask=f
>         big-task-538   [007] d.h2   403.401636: cpupri_find: after task=big-task-535 lowest_mask=0
>         big-task-538   [007] d.h2   403.401637: cpupri_find: it comes here
>         big-task-538   [007] d.h2   403.401638: find_lowest_rq: task=big-task-535 ret=0 lowest_mask=0
>         big-task-538   [007] d.h3   403.401640: sched_wakeup: comm=big-task pid=535 prio=89 target_cpu=007
>         big-task-538   [007] d.h3   403.401641: cpupri_find: before task=big-task-535 lowest_mask=f
>         big-task-538   [007] d.h3   403.401642: cpupri_find: after task=big-task-535 lowest_mask=0
>         big-task-538   [007] d.h3   403.401642: cpupri_find: it comes here
>         big-task-538   [007] d.h3   403.401643: find_lowest_rq: task=big-task-535 ret=0 lowest_mask=0
>         big-task-538   [007] d.h.   403.401644: irq_handler_exit: irq=3 ret=handled
>         big-task-538   [007] d..2   403.402413: sched_switch: prev_comm=big-task prev_pid=538 prev_prio=89 prev_state=S ==> next_comm=big-task next_pid=535 next_prio=89

Thanks for that.
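
For reference, the fitness_fn being probed there is the capacity check this
patch plugs into cpupri_find(). Paraphrasing from memory (not verbatim from
the patch), it looks roughly like:

	static bool rt_task_fits_capacity(struct task_struct *p, int cpu)
	{
		unsigned int min_cap, max_cap, cpu_cap;

		/* Only asymmetric capacity systems need this check */
		if (!static_branch_unlikely(&sched_asym_cpucapacity))
			return true;

		min_cap = uclamp_eff_value(p, UCLAMP_MIN);
		max_cap = uclamp_eff_value(p, UCLAMP_MAX);
		cpu_cap = capacity_orig_of(cpu);

		return cpu_cap >= min(min_cap, max_cap);
	}

With uclamp_min = 1024 only the big cores can ever fit, so once the fitness
filter is applied lowest_mask drops from f to 0 in your trace.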

If I read the trace correctly, the 5 tasks all end up on the big cores
(i.e. sharing them), correct?

The results I posted did show that we can end up with 2 tasks on a single big
core. I don't think we can call this good or bad in general, though to me it
is a good thing: the task asked to be on a big core, and the system tried its
best to honor that request.

Maybe we do want to cater for a default where all RT tasks are boosted; is
this what you're saying? If yes, how do you propose the logic should look? My
thought is to provide a runtime knob to tune most RT tasks down to a sensible
default, depending on how powerful (and power hungry) the system is, then use
the per-task API to boost the few tasks that really need more performance out
of the system.
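
To illustrate the per-task API: the below is a userspace sketch using the
uclamp extension of sched_setattr() (v5.3+, CONFIG_UCLAMP_TASK=y). There is
no glibc wrapper, so it goes through syscall() directly, and the SCHED_FIFO
priority is just an arbitrary example value:

	#define _GNU_SOURCE
	#include <sched.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <unistd.h>
	#include <sys/syscall.h>

	#ifndef SCHED_FLAG_UTIL_CLAMP_MIN
	#define SCHED_FLAG_UTIL_CLAMP_MIN	0x20
	#endif

	/* Mirrors the kernel's uapi struct sched_attr */
	struct sched_attr {
		uint32_t size;
		uint32_t sched_policy;
		uint64_t sched_flags;
		int32_t  sched_nice;
		uint32_t sched_priority;
		/* SCHED_DEADLINE fields, unused here */
		uint64_t sched_runtime;
		uint64_t sched_deadline;
		uint64_t sched_period;
		/* utilization clamps */
		uint32_t sched_util_min;
		uint32_t sched_util_max;
	};

	int main(void)
	{
		struct sched_attr attr = {
			.size		= sizeof(attr),
			.sched_policy	= SCHED_FIFO,
			.sched_priority	= 10,	/* arbitrary example */
			.sched_flags	= SCHED_FLAG_UTIL_CLAMP_MIN,
			.sched_util_min	= 1024,	/* ask for max capacity */
		};

		/* pid 0 == calling thread */
		if (syscall(SYS_sched_setattr, 0, &attr, 0))
			perror("sched_setattr");

		return 0;
	}

The few tasks that really need it would get this boost; everything else stays
at whatever sensible default the system-wide knob selects.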

Note from my results (assuming I didn't do anything stupid): if you boot a
system where every RT task runs with uclamp_min = 1024, then some tasks will
race to the big cores and the rest will stay where they are on the littles.

In my first version things looked slightly different, and I think the
fallback handling when no fitting CPU is found was better there.

Please have a look and let me know what you think.

https://lore.kernel.org/lkml/20190903103329.24961-1-qais.yousef@arm.com/

Thanks

--
Qais Yousef
