linux-kernel - Re: [PATCH 3/4 v0.4] sched/umcg: add Documentation/userspace-api/umcg.rst

Message-ID: <b6308f50-8e0a-c0bc-d25a-d8f515e5183d@uwaterloo.ca>
Date:   Mon, 9 Aug 2021 10:15:59 -0400
From:   Thierry Delisle <tdelisle@...terloo.ca>
To:     <posk@...gle.com>
CC:     <avagin@...gle.com>, <bsegall@...gle.com>, <jannh@...gle.com>,
        <linux-api@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <mingo@...hat.com>, <pabuhr@...terloo.ca>, <peterz@...radead.org>,
        <pjt@...gle.com>, <posk@...k.io>, <tdelisle@...terloo.ca>,
        <tglx@...utronix.de>
Subject: Re: [PATCH 3/4 v0.4] sched/umcg: add
 Documentation/userspace-api/umcg.rst

 > This is a wake, not a context switch, right?

I followed the "worker to worker context switch" procedure in the
documentation.

 > I'm not sure why you are concerned with context switching here. And even
 > if it were a context switch, the kernel manages thread stacks properly,
 > there's nothing to worry about.

I'm interested in this particular operation because its outcome is
decided by an invisible race (between W1 and W2) in the kernel. W2 might
context-switch to W1, or it might not. Note that I don't mean a race in
the problematic sense, just that there are two possible outcomes decided
by relative speed. I'm wondering how many outcomes the user needs to
program for, and whether they may have to back-track anything.

For example, if W2 wants to "yield to" W1, it must enqueue itself in the
user scheduler before making the system call. But if the system call does
not context-switch and W2 keeps running, W2 may need to undo the enqueue.
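
To make the back-tracking concern concrete, here is a minimal
single-threaded sketch in C. It is not UMCG code: run_queue, rq_push(),
rq_remove(), and fake_yield_to() are hypothetical names of my own, and the
"switched" flag merely simulates which of the two outcomes happened. It
assumes that, when the switch does happen, whoever resumes W2 later takes
it off the ready queue first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RQ_CAP 8

/* Hypothetical user-space ready queue holding worker ids. */
struct run_queue {
    int ids[RQ_CAP];
    size_t len;
};

static bool rq_push(struct run_queue *rq, int id)
{
    if (rq->len == RQ_CAP)
        return false;
    rq->ids[rq->len++] = id;
    return true;
}

/* Remove id if it is still queued; returns true if it was found. */
static bool rq_remove(struct run_queue *rq, int id)
{
    for (size_t i = 0; i < rq->len; i++) {
        if (rq->ids[i] == id) {
            rq->ids[i] = rq->ids[--rq->len]; /* order is not preserved */
            return true;
        }
    }
    return false;
}

/*
 * Stand-in for the "yield to W1" system call (assumption, not the real
 * interface). If the switch happened, some scheduler later dequeued
 * `self` in order to resume it; otherwise nothing touched the queue.
 */
static void fake_yield_to(struct run_queue *rq, int self, bool switched)
{
    if (switched)
        rq_remove(rq, self);
}

/* Returns true if the caller had to undo its own enqueue. */
static bool yield_to_with_undo(struct run_queue *rq, int self, bool switched)
{
    bool ok = rq_push(rq, self);   /* must be visible before the call */
    assert(ok);
    (void)ok;
    fake_yield_to(rq, self, switched);
    return rq_remove(rq, self);    /* still queued => we never left */
}
```

In this sketch the caller cannot observe "switched" directly; it only
learns the outcome from whether its own entry is still in the queue, which
is exactly the extra step I would like to know whether users must write.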

I agree the comment about the stack was a tangent, and I did expect the
kernel to handle it. However, I believe that how the kernel handles this
case affects the number of outcomes the user must handle in this scenario.

 > If both cmpxchg() succeeded, but W1 was never put to sleep, ttwu()
 > will do nothing and W1 will continue running on its initial CPU, while
 > W2 will continue running on its own CPU. WF_CURRENT_CPU is an advisory
 > flag, and in this situation it will not do anything.

This does not sound right to me. If ttwu() does nothing, W1 and W2 both
keep running. Who then sets W2's state back to RUNNING?

Is W2 responsible for doing that? It's not "the party initiating
the state transition" in this case.

Since there is no way for W2 to tell if it did context-switch to W1, does
that mean that W2 should always cmpxchg() its state to RUNNING after a
sys_umcg_wait()?
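
If the answer is yes, I imagine every return from the wait would be
followed by something like the sketch below. This is only my guess at the
user-side pattern, written with C11 atomics: W_IDLE and W_RUNNING are
placeholder values rather than the constants from the patch, and
reclaim_running() is a hypothetical helper, not part of the proposed API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder state values (assumption; not taken from the patch). */
enum { W_IDLE = 1, W_RUNNING = 2 };

/*
 * Called by a worker right after its wait call returns.
 * Returns true if the worker itself had to flip IDLE -> RUNNING
 * (nobody woke it, so it presumably never stopped running), and
 * false if some other party had already set RUNNING.
 */
static bool reclaim_running(_Atomic uint32_t *state)
{
    uint32_t expected = W_IDLE;

    if (atomic_compare_exchange_strong(state, &expected, W_RUNNING))
        return true;

    /* This sketch only models two states, so the waker must have won. */
    assert(expected == W_RUNNING);
    return false;
}
```

The return value is the only hint the worker gets about which outcome
occurred, and it comes at the cost of an unconditional atomic operation
after every wait.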