Message-ID: <CAPNVh5eBJB+QDr+gH4DvK1raho0tQx=w_LUFm5Gq7TVijoKrBg@mail.gmail.com>
Date: Wed, 19 Jan 2022 09:33:15 -0800
From: Peter Oskolkov <posk@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Peter Oskolkov <posk@...k.io>, mingo@...hat.com,
tglx@...utronix.de, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
bristot@...hat.com, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-api@...r.kernel.org, x86@...nel.org,
pjt@...gle.com, avagin@...gle.com, jannh@...gle.com,
tdelisle@...terloo.ca
Subject: Re: [RFC][PATCH 3/3] sched: User Mode Concurrency Groups

On Wed, Jan 19, 2022 at 12:47 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Jan 18, 2022 at 10:19:21AM -0800, Peter Oskolkov wrote:
> > ============= worker-to-worker context switches
> >
> > One example: absl::Mutex (https://abseil.io/about/design/mutex) has
> > google-internal extensions that are "fiber aware". More specifically,
> > consider this situation:
> >
> > - worker W1 acquired the mutex and is doing its work
> > - worker W2 calls mutex::lock()
> >
> > mutex::lock(), being aware of workers, understands that W2 is about to
> > sleep; instead of simply sleeping, waking the server, and letting the
> > server figure out what to run in place of the sleeping worker,
> > mutex::lock() calls into the userspace scheduler in the context of the
> > running W2, and the userspace scheduler then picks W3 to run and does
> > a W2->W3 context switch.
> >
> > The optimization above replaces W2->Server and Server->W3 context switches
> > with a single W2->W3 context switch, which is a material performance gain.
>
> Yes, I've also already reconsidered. Things like pipelines and other
> fixed order scheduling policies will greatly benefit from
> worker-to-worker switching.
>
> But I think all of them are explicit. That is, we can limit the
> ::next_tid usage to sys_umcg_wait() and never look at it for implicit
> blocks.
Yes, of course - when a worker blocks, its server gets notified.
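
To make the explicit case concrete, below is a rough sketch of how a
fiber-aware mutex::lock() slow path could do the W2->W3 switch on top
of this series. The struct layout and the sys_umcg_wait() signature
follow my reading of the patches (so treat them as approximate);
pick_next_worker() and current_umcg_task() are hypothetical helpers of
the userspace scheduler, and the syscall number is a placeholder:

#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Minimal stand-in for this series' struct umcg_task; only the
 * field used below matters here, and the layout is approximate. */
struct umcg_task {
        uint64_t state_ts;
        uint32_t next_tid;
        uint32_t server_tid;
};

#define __NR_umcg_wait 450      /* placeholder syscall number */

/* Hypothetical userspace-scheduler helpers, not part of the RFC. */
extern struct umcg_task *current_umcg_task(void);  /* W2's umcg_task */
extern pid_t pick_next_worker(void);               /* picks W3 */

/* W2's mutex::lock() slow path: instead of blocking, waking the
 * server, and letting the server pick a replacement, switch directly
 * to the worker the userspace scheduler picked. ::next_tid is only
 * honored in this explicit sys_umcg_wait() path, never on implicit
 * blocks. */
static void lock_slow_path(void)
{
        struct umcg_task *self = current_umcg_task();

        self->next_tid = (uint32_t)pick_next_worker();
        syscall(__NR_umcg_wait, 0 /* flags */, 0 /* abs_timeout */);
        self->next_tid = 0;
}

This is what collapses the W2->Server and Server->W3 switches into a
single kernel entry.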
>
> > In addition, when W1 calls mutex::unlock(), the scheduling code determines
> > that W2 is waiting on the mutex, and thus calls W2::wake() from the
> > context of the running W1 (you asked earlier why we need "WAKE_ONLY").
>
> This I'm not at all convinced on. That sounds like it will violate the
> 1:1 thing.
wake_only is a wakeup event: the worker gets added to the wake queue,
not scheduled on a CPU. We don't have to implement it in the kernel,
though; userspace may keep its own wake queue for workers like this,
so feel free to ignore this operation.
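
For completeness, a minimal sketch of the kind of userspace wake queue
I have in mind (hand-rolled, nothing from the patch set): unlock() only
records that W2 is runnable; W2 reaches a CPU later, when the userspace
scheduler pops it and feeds it into ::next_tid as above.

#include <pthread.h>
#include <sys/types.h>

/* A tiny fixed-size wake queue; no overflow handling, sketch only. */
#define WQ_SIZE 64

static struct {
        pthread_mutex_t lock;
        pid_t items[WQ_SIZE];
        unsigned int head, tail;
} wq = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Called from W1's mutex::unlock() path: marks W2 runnable without
 * asking the kernel to do anything (the "wake_only" event). */
void wake_worker(pid_t tid)
{
        pthread_mutex_lock(&wq.lock);
        wq.items[wq.tail++ % WQ_SIZE] = tid;
        pthread_mutex_unlock(&wq.lock);
}

/* Called by the userspace scheduler when picking the next worker;
 * returns 0 if no woken worker is pending. */
pid_t next_woken_worker(void)
{
        pid_t tid = 0;

        pthread_mutex_lock(&wq.lock);
        if (wq.head != wq.tail)
                tid = wq.items[wq.head++ % WQ_SIZE];
        pthread_mutex_unlock(&wq.lock);
        return tid;
}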