linux-kernel - Re: [RFC PATCH v2 4/5] sched: UMCG: add a blocked worker list

Message-ID: <YeU0nr6DfBCaH6UF@hirez.programming.kicks-ass.net>
Date:   Mon, 17 Jan 2022 10:19:26 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Peter Oskolkov <posk@...gle.com>
Cc:     mingo@...hat.com, tglx@...utronix.de, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linux-api@...r.kernel.org, x86@...nel.org,
        pjt@...gle.com, avagin@...gle.com, jannh@...gle.com,
        tdelisle@...terloo.ca, posk@...k.io
Subject: Re: [RFC PATCH v2 4/5] sched: UMCG: add a blocked worker list

On Thu, Jan 13, 2022 at 03:39:39PM -0800, Peter Oskolkov wrote:
> The original idea of a UMCG server was that it was used as a proxy
> for a CPU, so if a worker associated with the server is RUNNING,
> the server itself was never allowed to be RUNNING as well;
> when umcg_wait() returned for a server, it meant that its worker
> became BLOCKED.
> 
> In the new (old?) "per server runqueues" model implemented in
> the previous patch in this patchset, servers are woken when
> a previously blocked worker on their runqueue finishes its blocking
> operation, even if the currently RUNNING worker continues running.
> 
> As now a server may run while a worker assigned to it is running,
> the original idea of having at most a single worker RUNNING per
> server, as a means to control the number of running workers, is
> not really enforced, and the server, woken by a worker
> doing BLOCKED=>RUNNABLE transition, may then call sys_umcg_wait()
> with a second/third/etc. worker to run.
> 
> Support this scenario by adding a blocked worker list:
> when a worker transitions RUNNING=>BLOCKED, not only its server
> is woken, but the worker is also added to the blocked worker list
> of its server.
> 
> This change introduces the following benefits:
> - block detection now behaves similarly to wake detection;
>   without this patch worker wakeups added wakees to the list
>   and woke the server, while worker blocks only woke the server
>   without adding blocked workers to a list, forcing servers
>   to explicitly check worker's state;
> - if the blocked worker woke sufficiently quickly, the server
>   woken on the block event would observe its worker now as
>   RUNNABLE, so the block event had to be inferred rather than
>   explicitly signalled by the worker being added to the blocked
>   worker list;
> - it is now possible for a single server to control several
>   RUNNING workers, which makes writing userspace schedulers
>   simpler for smaller processes that do not need to scale beyond
>   one "server";
> - if the userspace wants to keep at most a single RUNNING worker
>   per server, and have multiple servers with their own runqueues,
>   this model is also naturally supported here.
> 
> So this change basically decouples block/wake detection from
> M:N threading in the sense that the number of servers now
> does not have to be M or N, but is more driven by the scalability
> needs of the userspace application.
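
As a rough illustration of the blocked worker list described above, here
is a minimal single-threaded userspace sketch in C. It is not the kernel
code from the patch: the struct layouts and the names worker_block(),
worker_wake(), server_drain_blocked() and server_drain_runnable() are
hypothetical, and a plain counter stands in for actually waking the
server. It only shows why queueing the worker on a block event lets the
server see the block explicitly, even if the worker is already RUNNABLE
again by the time the server looks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states, named after the ones in the commit message. */
enum worker_state { WORKER_RUNNING, WORKER_BLOCKED, WORKER_RUNNABLE };

struct worker {
	enum worker_state state;
	struct worker *next_blocked;   /* link on the server's blocked list */
	struct worker *next_runnable;  /* link on the server's runnable list */
	bool on_blocked, on_runnable;  /* guard against double insertion */
};

struct server {
	struct worker *blocked;   /* workers that did RUNNING => BLOCKED */
	struct worker *runnable;  /* workers that did BLOCKED => RUNNABLE */
	unsigned int wakeups;     /* stand-in for "wake the server" */
};

/* RUNNING => BLOCKED: queue the worker on the blocked list, wake the server. */
static void worker_block(struct server *s, struct worker *w)
{
	w->state = WORKER_BLOCKED;
	if (!w->on_blocked) {
		w->next_blocked = s->blocked;
		s->blocked = w;
		w->on_blocked = true;
	}
	s->wakeups++;
}

/* BLOCKED => RUNNABLE: queue the worker on the runnable list, wake the server. */
static void worker_wake(struct server *s, struct worker *w)
{
	w->state = WORKER_RUNNABLE;
	if (!w->on_runnable) {
		w->next_runnable = s->runnable;
		s->runnable = w;
		w->on_runnable = true;
	}
	s->wakeups++;
}

/* Server side: empty the blocked list and return how many workers were on it. */
static unsigned int server_drain_blocked(struct server *s)
{
	unsigned int n = 0;

	while (s->blocked) {
		struct worker *w = s->blocked;

		s->blocked = w->next_blocked;
		w->next_blocked = NULL;
		w->on_blocked = false;
		n++;
	}
	return n;
}

/* Server side: empty the runnable list and return how many workers were on it. */
static unsigned int server_drain_runnable(struct server *s)
{
	unsigned int n = 0;

	while (s->runnable) {
		struct worker *w = s->runnable;

		s->runnable = w->next_runnable;
		w->next_runnable = NULL;
		w->on_runnable = false;
		n++;
	}
	return n;
}

/*
 * A worker blocks and wakes again before the server gets to run.
 * Returns 1 if the server still finds both events on its lists.
 */
static int demo_quick_wake(void)
{
	struct worker w = { .state = WORKER_RUNNING };
	struct server s = { 0 };

	worker_block(&s, &w);
	worker_wake(&s, &w);
	assert(s.wakeups == 2);

	return w.state == WORKER_RUNNABLE &&
	       server_drain_blocked(&s) == 1 &&
	       server_drain_runnable(&s) == 1;
}
```

In demo_quick_wake() the server only sees the worker as RUNNABLE, yet
the block event is still on its blocked list and does not have to be
inferred from the worker's state.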

So I don't object to having this blocking list, we had that early on in
the discussions.

*However*, combined with WF_CURRENT_CPU this 1:N userspace model doesn't
really make sense. Combined with Proxy-Exec (if we ever get that sorted)
it will fundamentally not work.

More consideration is needed I think...