linux-kernel - Re: [RESEND RFC PATCH 0/3] Provide fast access to thread specific data

Message-ID: <BEE6A914-6EB0-4649-9604-572728B04D70@oracle.com>
Date:   Tue, 14 Sep 2021 16:10:09 +0000
From:   Prakash Sangappa <prakash.sangappa@...cle.com>
To:     Peter Oskolkov <posk@...gle.com>
CC:     Jann Horn <jannh@...gle.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-api <linux-api@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>, Paul Turner <pjt@...gle.com>,
        Peter Oskolkov <posk@...k.io>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RESEND RFC PATCH 0/3] Provide fast access to thread specific
 data

> On Sep 13, 2021, at 11:00 AM, Peter Oskolkov <posk@...gle.com> wrote:
> 
> On Mon, Sep 13, 2021 at 10:36 AM Prakash Sangappa
> <prakash.sangappa@...cle.com> wrote:
> 
> [...]
> 
>>> This sounds, again, as if the kernel should be aware of the kind of
>>> items being allocated; having a more generic mechanism of allocating
>>> pinned memory for the userspace to use at its discretion would be more
>>> generally useful, I think. But how then the kernel/system should be
>>> protected from a buggy or malicious process trying to grab too much?
>>> 
>>> One option would be to have a generic in-kernel mechanism for this,
>>> but expose it to the userspace via domain-specific syscalls that do
>>> the accounting you hint at. This sounds a bit like an over-engineered
>>> solution, though…
>> 
>> 
>> What will this pinned memory be used for in your use case,
>> can you explain?
> 
> For userspace scheduling, to share thread/task state information
> between the kernel and the userspace. This memory will be allocated
> per task/thread; both the kernel and the userspace will write to the
> shared memory, and these reads/writes will happen not only in the
> memory regions belonging to the "current" task/thread, but also to
> remote tasks/threads.
> 
> Somewhat detailed doc/rst is here:
> https://lore.kernel.org/lkml/20210908184905.163787-5-posk@google.com/

(Resending reply)

From what I could glean from the link above, it looks like you will need the
entire 'struct umcg_task' (which is 24 bytes in size) in the per-thread shared
mapped space (pinned memory?), accessed/updated both in user space
and in the kernel. It appears the state transitions here are specific to umcg,
so it may not be usable in other use cases that are interested in just checking
if a thread is executing on a cpu or blocked.

We have a requirement to share thread state as well (on or off cpu) in the
shared structure, which will also be accessed by other threads in user
space. The kernel updates the state when the thread blocks or resumes execution.
We need to see whether the task state you have could be repurposed when
not used by umcg threads.
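
As a rough illustration of the on/off-cpu requirement described above, here is
a minimal userspace sketch. All names (sketch_task_shared, sketch_set_state,
sketch_is_on_cpu) and the layout are hypothetical and are not taken from the
patch set or from umcg; the writer function merely stands in for the kernel
updating the state when a thread blocks or resumes.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state values; the real encoding is not specified here. */
enum sketch_cpu_state { SKETCH_OFF_CPU = 0, SKETCH_ON_CPU = 1 };

/* Hypothetical per-thread shared structure. Aligning the first member to
 * 128 bytes pads the whole structure to 128 bytes. */
struct sketch_task_shared {
    alignas(128) _Atomic uint32_t cpu_state;
};

static_assert(sizeof(struct sketch_task_shared) == 128,
              "structure should be padded to 128 bytes");

/* Stand-in for the kernel-side update on block/resume. */
static inline void sketch_set_state(struct sketch_task_shared *ts,
                                    enum sketch_cpu_state s)
{
    atomic_store_explicit(&ts->cpu_state, (uint32_t)s, memory_order_release);
}

/* What another thread in the same process might do to check the state. */
static inline bool sketch_is_on_cpu(struct sketch_task_shared *ts)
{
    return atomic_load_explicit(&ts->cpu_state, memory_order_acquire)
           == SKETCH_ON_CPU;
}
```

A reader would treat the value only as a hint, since the thread can block or
resume right after the load.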

Regarding use of pinned memory, it is not an arbitrary amount per thread then,
right? Basically you need 24 bytes per thread. The proposed task_getshared()
allocates pinned memory pages to accommodate requests from as many
threads in a process as need to use the shared structure
(padded to 128 bytes). The amount of memory/pages consumed will be
bounded by the number of threads a process can create. As I mentioned in the
cover letter, multiple shared structures are fit/allocated from a page.
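
To make the bound concrete, the arithmetic can be sketched as below. The
128-byte slot size is from the description above; the 4 KiB page size and the
helper names are assumptions for illustration only, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The actual page size depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096u
/* Shared structure size after padding, as stated above. */
#define SKETCH_SLOT_SIZE 128u

static_assert(SKETCH_PAGE_SIZE % SKETCH_SLOT_SIZE == 0,
              "slots should tile a page exactly");

/* Number of padded shared structures that fit in one page. */
static inline size_t sketch_slots_per_page(void)
{
    return SKETCH_PAGE_SIZE / SKETCH_SLOT_SIZE;
}

/* Pinned pages needed if each of nthreads threads requests one slot,
 * rounding up to whole pages. */
static inline size_t sketch_pinned_pages_needed(size_t nthreads)
{
    size_t per_page = sketch_slots_per_page();
    return (nthreads + per_page - 1) / per_page;
}
```

Under these assumptions a page holds 32 structures, so a process with 1000
threads using the shared structure would pin 32 pages.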