linux-kernel - Re: [RFC PATCH 4/4 v0.3] sched/umcg: RFC: implement UMCG syscalls

Message-ID: <CAPNVh5fug5cPu7gPoAR7ZiKzAZ5i8007=Hs9_MG+fCTL3XkLBQ@mail.gmail.com>
Date:   Mon, 26 Jul 2021 09:44:27 -0700
From:   Peter Oskolkov <posk@...gle.com>
To:     Thierry Delisle <tdelisle@...terloo.ca>
Cc:     Peter Oskolkov <posk@...k.io>, Andrei Vagin <avagin@...gle.com>,
        Ben Segall <bsegall@...gle.com>, Jann Horn <jannh@...gle.com>,
        Jim Newsome <jnewsome@...project.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        linux-api@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Paul Turner <pjt@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Buhr <pabuhr@...terloo.ca>
Subject: Re: [RFC PATCH 4/4 v0.3] sched/umcg: RFC: implement UMCG syscalls

On Fri, Jul 23, 2021 at 12:06 PM Thierry Delisle <tdelisle@...terloo.ca> wrote:
>
>  > In my tests reclaimed nodes have their next pointers immediately set
>  > to point to the list head. If the kernel gets a node with its @next
>  > pointing to something else, then yes, things break down (the kernel
>  > kills the process); this has happened occasionally when I had a bug in
>  > the userspace code.
>
> I believe that approach is fine for production, but for testing it may
> not detect some bugs. For example, it may not detect the race I detail
> below.

While I think I have the idle servers list working, I now believe that
what peterz@ was suggesting is not much slower in the common case
(many idle workers; few, if any, idle servers) than exposing a list of
idle servers to the kernel. I think a single idle server at head, not
a list, is enough: when a worker is added to the idle workers list, the
idle server at head, if present, can be "popped" and woken. The
userspace can maintain the list of idle servers itself. Having the
kernel wake only one server is enough: that server will pop all idle
workers and decide whether any other servers are needed to process the
newly available work.
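
A minimal userspace sketch of the single-idle-server idea described
above, using C11 atomics. This is an illustration only, not the UMCG
ABI or the code in the patchset: every name here (struct idle_state,
worker_becomes_idle(), server_publish_idle(), and so on) is
hypothetical, and the "kernel" side is simulated by an ordinary
function that returns the tid of the server it would wake.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the UMCG ABI. */
struct idle_worker {
	struct idle_worker *next;
	int tid;
};

struct idle_state {
	_Atomic(struct idle_worker *) workers_head; /* LIFO list of idle workers */
	atomic_int idle_server_tid;                 /* 0: no idle server published */
};

static void idle_state_init(struct idle_state *st)
{
	atomic_init(&st->workers_head, (struct idle_worker *)NULL);
	atomic_init(&st->idle_server_tid, 0);
}

/*
 * Simulated kernel side: push an idle worker onto the list, then "pop"
 * the single published idle server, if any. Returns the tid of the
 * server to wake, or 0 if no server was published.
 */
static int worker_becomes_idle(struct idle_state *st, struct idle_worker *w)
{
	struct idle_worker *old = atomic_load(&st->workers_head);

	do {
		w->next = old;
	} while (!atomic_compare_exchange_weak(&st->workers_head, &old, w));

	return atomic_exchange(&st->idle_server_tid, 0);
}

/*
 * Userspace side: publish one idle server in the single slot.
 * Returns 1 on success, 0 if another server already occupies the slot
 * (that server would then stay on a userspace-only idle list).
 */
static int server_publish_idle(struct idle_state *st, int tid)
{
	int expected = 0;

	assert(tid != 0); /* 0 is reserved for "no server" */
	return atomic_compare_exchange_strong(&st->idle_server_tid,
					      &expected, tid);
}

/* Userspace side: a woken server takes all idle workers in one step. */
static struct idle_worker *server_take_all_workers(struct idle_state *st)
{
	return atomic_exchange(&st->workers_head, (struct idle_worker *)NULL);
}
```

In this sketch a server that publishes itself would presumably need to
re-check the idle workers list before going to sleep, since a worker
may have been queued just before the slot was filled. How the real
patchset closes that window is not something this sketch claims to
show.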

[...]

>  > Workers are trickier, as they can be woken by signals and then block
>  > again, but stray signals are so bad here that I'm thinking of actually
>  > not letting sleeping workers wake on signals. Other than signals
>  > waking queued/unqueued idle workers, are there any other potential
>  > races here?
>
> A timeout on a blocked thread is virtually the same as a signal, I
> think. I can see that both could lead to attempts at waking workers
> that are not blocked.

I've got preemption working well enough to warrant a new RFC patchset
(I also have timeouts done, but these were easy). I'll clean things up,
change the idle servers logic so that only one idle server is exposed
to the kernel, not a list, add some additional documentation (state
transitions, userspace code snippets, etc.), and post a v0.4 RFC
patchset to LKML later this week.

[...]