Date:   Fri, 14 Oct 2022 15:32:56 -0700
From:   "Connor O'Brien" <connoro@...gle.com>
To:     Valentin Schneider <vschneid@...hat.com>
Cc:     linux-kernel@...r.kernel.org, kernel-team@...roid.com,
        John Stultz <jstultz@...gle.com>,
        Joel Fernandes <joelaf@...gle.com>,
        Qais Yousef <qais.yousef@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Will Deacon <will@...nel.org>,
        Waiman Long <longman@...hat.com>,
        Boqun Feng <boqun.feng@...il.com>,
        "Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [RFC PATCH 09/11] sched/rt: Fix proxy/current (push,pull)ability

On Mon, Oct 10, 2022 at 4:40 AM Valentin Schneider <vschneid@...hat.com> wrote:
>
> On 03/10/22 21:44, Connor O'Brien wrote:
> > From: Valentin Schneider <valentin.schneider@....com>
>
> This was one of my attempts at fixing RT load balancing (the BUG_ON in
> pick_next_pushable_task() was quite easy to trigger), but I ended up
> convincing myself this was insufficient - this only "tags" the donor and
> the proxy, the entire blocked chain needs tagging. Hopefully not all of
> what I'm about to write is nonsense, some of the neurons I need for this
> haven't been used in a while - to be taken with a grain of salt.

Made sense to me! Thanks, this was really helpful for understanding
the interactions between proxy execution & load balancing.
>
> Consider pick_highest_pushable_task() - we don't want any task in a blocked
> chain to be pickable. There's no point in migrating it, we'll just hit
> schedule()->proxy(), follow p->blocked_on and most likely move it back to
> where the rest of the chain is. This applies to any sort of balancing (CFS,
> RT, DL).
>
> ATM I think PE breaks the "run the N highest priority tasks on our N CPUs"
> policy. Consider:
>
>    p0 (FIFO42)
>     |
>     | blocked_on
>     v
>    p1 (FIFO41)
>     |
>     | blocked_on
>     v
>    p2 (FIFO40)
>
>   Add on top p3, an unrelated FIFO1 task, and p4, an unrelated CFS task.
>
>   CPU0
>   current:  p0
>   proxy:    p2
>   enqueued: p0, p1, p2, p3
>
>   CPU1
>   current:  p4
>   proxy:    p4
>   enqueued: p4
>
>
> pick_next_pushable_task() on CPU0 would pick p1 as the next highest
> priority task to push away to e.g. CPU1, but that would be undone as soon
> as proxy() happens on CPU1: we'd notice the CPU boundary and punt it back
> to CPU0. What we would want here is to pick p3 instead to have it run on
> CPU1.

Given this point, is there any reason that blocked tasks should ever
be pushable, even if they are not part of the blocked chain for the
currently running task? If we could just check task_is_blocked()
rather than needing to know whether the task is in the middle of the
"active" chain, that would seem to simplify things greatly. I think
that approach might also require another dequeue/enqueue, in
ttwu_runnable(), to catch non-proxy blocked tasks becoming unblocked
(and therefore pushable), but that *seems* OK...though I could
certainly be missing something.
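
To make the idea concrete, here is a minimal userspace sketch of the
selection logic, modelled on the p0..p3 example above. It is not kernel
code: struct task, pick_pushable() and example_pick_prio() are
hypothetical names, blocked_on is simplified to point at the next task
in the chain, and priorities use the FIFO numbers from the example
(higher wins) rather than the kernel's internal p->prio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct task_struct (not kernel code). */
struct task {
	const char *name;
	int rt_prio;                   /* FIFO priority, higher wins; 0 means non-RT */
	const struct task *blocked_on; /* simplification: next task in the chain, NULL if runnable */
};

/* Assumed semantics: a task is blocked whenever blocked_on is set. */
static bool task_is_blocked(const struct task *p)
{
	return p->blocked_on != NULL;
}

/*
 * Pick the highest-priority RT task that may be pushed to another CPU.
 * curr and proxy are never candidates. When skip_blocked is true, any
 * blocked task is excluded as well, whether or not it belongs to the
 * chain of the currently running task.
 */
static const struct task *pick_pushable(const struct task **queued, size_t n,
					const struct task *curr,
					const struct task *proxy,
					bool skip_blocked)
{
	const struct task *best = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct task *p = queued[i];

		if (p == curr || p == proxy)
			continue;
		if (p->rt_prio == 0)
			continue;
		if (skip_blocked && task_is_blocked(p))
			continue;
		if (!best || p->rt_prio > best->rt_prio)
			best = p;
	}
	return best;
}

/* Build the CPU0 runqueue from the example and return the chosen priority. */
int example_pick_prio(bool skip_blocked)
{
	struct task p2 = { "p2", 40, NULL };
	struct task p1 = { "p1", 41, &p2 };
	struct task p0 = { "p0", 42, &p1 };
	struct task p3 = { "p3", 1, NULL };
	const struct task *cpu0[] = { &p0, &p1, &p2, &p3 };
	const struct task *best;

	/* current: p0, proxy: p2, as in the quoted example */
	best = pick_pushable(cpu0, 4, &p0, &p2, skip_blocked);
	assert(best != NULL);
	return best->rt_prio;
}
```

In this model example_pick_prio(false) returns 41 (p1 is chosen, the
case that proxy() would immediately undo), while example_pick_prio(true)
returns 1 (p3 is chosen). The sketch deliberately ignores the extra
dequeue/enqueue needed to update pushability when a task unblocks.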

A related load balancing correctness question that caught my eye while
taking another look at this code: when we have rq->curr != rq->proxy
and then rq->curr is preempted and switches out, IIUC rq->curr should
become pushable immediately - but this does not seem to be the case
even with this patch. Is there a path that handles this case that I'm
just missing, or a reason that no special handling is needed?
Otherwise I wonder if __schedule() might need a dequeue/enqueue for
the prev task as well in this case.
>
> I *think* we want only the proxy of an entire blocked-chain to be visible
> to load-balance, unfortunately PE gathers the blocked-chain onto the
> donor's CPU which kinda undoes that.
>
> Having the blocked tasks remain in the rq is very handy as it directly
> gives us the scheduling context and we can unwind the blocked chain for the
> execution context, but it does wreak havoc in load-balancing :/
>
