Message-ID: <CAFTs51W6ZHrGaoXEbXNCkVKLxe7_S2raYcXMBzypC7VUDMrU-w@mail.gmail.com>
Date: Mon, 11 Oct 2021 15:45:36 -0700
From: Peter Oskolkov <posk@...k.io>
To: Thierry Delisle <tdelisle@...terloo.ca>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-api@...r.kernel.org, Paul Turner <pjt@...gle.com>,
Ben Segall <bsegall@...gle.com>,
Peter Oskolkov <posk@...gle.com>,
Andrei Vagin <avagin@...gle.com>, Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH 5/5 v0.6] sched/umcg: add Documentation/userspace-api/umcg.txt
Hi Thierry,
Sorry for the delayed reply - I'm finally going through the
documentation patches in preparation for mailing out the next
version of the patchset.
On Wed, Sep 22, 2021 at 11:39 AM Thierry Delisle <tdelisle@...terloo.ca> wrote:
>
> On 2021-09-17 2:03 p.m., Peter Oskolkov wrote:
> > [...]
> > +SYS_UMCG_WAIT()
> > +
> > +int sys_umcg_wait(uint32_t flags, uint64_t abs_timeout) operates on
> > +registered UMCG servers and workers: struct umcg_task *self provided to
> > +sys_umcg_ctl() when registering the current task is consulted in addition
> > +to flags and abs_timeout parameters.
> > +
> > +The function can be used to perform one of the three operations:
> > +
> > +* wait: if self->next_tid is zero, sys_umcg_wait() puts the current
> > + server or worker to sleep;
>
> I believe this description is misleading but I might be wrong.
> From the example
> * worker to server context switch (worker "yields"):
> S:IDLE+W:RUNNING => +S:RUNNING+W:IDLE
>
> It seems to me that when a worker goes from running to idle, it should
> *not* set the next_tid to 0, it should preserve the next_tid as-is,
> which is expected to point to its current server. This is consistent
> with my understanding of the umcg_wait implementation. This operation
> is effectively a direct context-switch to the server.
The documentation here outlines what sys_umcg_wait() does, and it does
put the current task to sleep without context switching if next_tid is
zero. Whether this behavior is appropriate for a worker wishing to
yield/park itself is a "policy" question, if you wish, and that
"policy" level is described in the "state transitions" section later
in the document. sys_umcg_wait() does not enforce this "policy"
directly, which keeps it simpler and easier to describe and reason
about.
>
> With that said, I'm a little confused by the usage of "yields" in that
> example. I would expect workers yielding to behave like kernel threads
> calling sched_yield(), i.e., context switch to the server but also be
> immediately added to the idle_workers_ptr.
>
> From my understanding of the umcg_wait call, "worker to server context
> switch" does not have analogous behaviour to sched_yield. Am I correct?
> If so, I suggest using "park" instead of "yield" in the description
> of that example. I believe the naming of wait/wake as park/unpark is
> consistent with Java[1] and Rust[2], but I don't know if that naming
> is used in contexts closer to the linux kernel.
>
> [1]
> https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/locks/LockSupport.html
> [2] https://doc.rust-lang.org/std/thread/fn.park.html
I'm not a fan of arguing about how to name things. If the maintainers
ask me to rename wait/wake to park/unpark, I'll do that. But it seems
they are OK with this terminology, I believe because wait/wake is a
relatively well understood pair of verbs in the kernel context;
futexes, for example, have wait/wake operations. A higher-level
library in userspace may later expose park/unpark functions that
call sys_umcg_wait() at the lower level...