Message-ID: <20200203145752.sqxpqse6pav4nxxv@e107158-lin>
Date:   Mon, 3 Feb 2020 14:57:53 +0000
From:   Qais Yousef <qais.yousef@....com>
To:     Pavan Kondeti <pkondeti@...eaurora.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] sched: rt: Make RT capacity aware

On 02/03/20 11:02, Pavan Kondeti wrote:
> The full trace is attached. I'm copy/pasting the snippet that shows the
> packing happening. The custom trace_printk calls are added in cpupri_find()
> before calling fitness_fn(). As you can see, pid=535 is woken on CPU#7,
> where the RT task pid=538 is running. Right after waking, a push is
> attempted, but it doesn't work either.
> 
> This is not a serious problem for us, since we don't set uclamp.min=1024 on
> RT tasks. However, it changes the behavior and might introduce latency for
> RT tasks on b.L platforms running the upstream kernel as-is.
> 
>         big-task-538   [007] d.h.   403.401633: irq_handler_entry: irq=3 name=arch_timer
>         big-task-538   [007] d.h2   403.401633: sched_waking: comm=big-task pid=535 prio=89 target_cpu=007
>         big-task-538   [007] d.h2   403.401635: cpupri_find: before task=big-task-535 lowest_mask=f
>         big-task-538   [007] d.h2   403.401636: cpupri_find: after task=big-task-535 lowest_mask=0
>         big-task-538   [007] d.h2   403.401637: cpupri_find: it comes here
>         big-task-538   [007] d.h2   403.401638: find_lowest_rq: task=big-task-535 ret=0 lowest_mask=0
>         big-task-538   [007] d.h3   403.401640: sched_wakeup: comm=big-task pid=535 prio=89 target_cpu=007
>         big-task-538   [007] d.h3   403.401641: cpupri_find: before task=big-task-535 lowest_mask=f
>         big-task-538   [007] d.h3   403.401642: cpupri_find: after task=big-task-535 lowest_mask=0
>         big-task-538   [007] d.h3   403.401642: cpupri_find: it comes here
>         big-task-538   [007] d.h3   403.401643: find_lowest_rq: task=big-task-535 ret=0 lowest_mask=0
>         big-task-538   [007] d.h.   403.401644: irq_handler_exit: irq=3 ret=handled
>         big-task-538   [007] d..2   403.402413: sched_switch: prev_comm=big-task prev_pid=538 prev_prio=89 prev_state=S ==> next_comm=big-task next_pid=535 next_prio=89

Thanks for that.
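
For reference, the fitness_fn being probed there is the capacity check this
patch plugs into cpupri_find(). Paraphrasing from memory (not verbatim from
the patch), it looks roughly like:

	static bool rt_task_fits_capacity(struct task_struct *p, int cpu)
	{
		unsigned int min_cap, max_cap, cpu_cap;

		/* Only asymmetric capacity systems need this check */
		if (!static_branch_unlikely(&sched_asym_cpucapacity))
			return true;

		min_cap = uclamp_eff_value(p, UCLAMP_MIN);
		max_cap = uclamp_eff_value(p, UCLAMP_MAX);
		cpu_cap = capacity_orig_of(cpu);

		return cpu_cap >= min(min_cap, max_cap);
	}

With uclamp_min = 1024 only the big cores can ever fit, so once the fitness
filter is applied lowest_mask drops from f to 0 in your trace.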

If I read the trace correctly, the 5 tasks all end up on the big cores
(i.e. sharing them), correct?

The results I posted did show that we can end up with 2 tasks on a single big
core. I don't think we can call this good or bad in general, though to me it
is a good thing: the task asked to be on a big core, and the system tried its
best to honor that request.

Maybe we do want to cater for a default where all RT tasks are boosted; is
this what you're saying? If yes, how do you propose the logic should look? My
thought is to provide a runtime knob to tune most RT tasks down to a sensible
default, depending on how powerful (and power hungry) the system is, then use
the per-task API to boost the few tasks that really need more performance out
of the system.
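
To illustrate the per-task API: the below is a userspace sketch using the
uclamp extension of sched_setattr() (v5.3+, CONFIG_UCLAMP_TASK=y). There is
no glibc wrapper, so it goes through syscall() directly, and the SCHED_FIFO
priority is just an arbitrary example value:

	#define _GNU_SOURCE
	#include <sched.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <unistd.h>
	#include <sys/syscall.h>

	#ifndef SCHED_FLAG_UTIL_CLAMP_MIN
	#define SCHED_FLAG_UTIL_CLAMP_MIN	0x20
	#endif

	/* Mirrors the kernel's uapi struct sched_attr */
	struct sched_attr {
		uint32_t size;
		uint32_t sched_policy;
		uint64_t sched_flags;
		int32_t  sched_nice;
		uint32_t sched_priority;
		/* SCHED_DEADLINE fields, unused here */
		uint64_t sched_runtime;
		uint64_t sched_deadline;
		uint64_t sched_period;
		/* utilization clamps */
		uint32_t sched_util_min;
		uint32_t sched_util_max;
	};

	int main(void)
	{
		struct sched_attr attr = {
			.size		= sizeof(attr),
			.sched_policy	= SCHED_FIFO,
			.sched_priority	= 10,	/* arbitrary example */
			.sched_flags	= SCHED_FLAG_UTIL_CLAMP_MIN,
			.sched_util_min	= 1024,	/* ask for max capacity */
		};

		/* pid 0 == calling thread */
		if (syscall(SYS_sched_setattr, 0, &attr, 0))
			perror("sched_setattr");

		return 0;
	}

The few tasks that really need it would get this boost; everything else stays
at whatever sensible default the system-wide knob selects.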

Note from my results (assuming I didn't do anything stupid): if you boot a
system where every RT task runs with uclamp_min = 1024, then some tasks will
race to the big cores and the rest will stay where they are on the littles.

In my first version things looked slightly different, and I think the
fallback handling when no fitting CPU is found was better there.

Please have a look and let me know what you think.

https://lore.kernel.org/lkml/20190903103329.24961-1-qais.yousef@arm.com/

Thanks

--
Qais Yousef
