[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 9 Sep 2021 12:06:58 -0700
From: Peter Oskolkov <posk@...gle.com>
To: Jann Horn <jannh@...gle.com>
Cc: Peter Oskolkov <posk@...k.io>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org, linux-api@...r.kernel.org,
Paul Turner <pjt@...gle.com>, Ben Segall <bsegall@...gle.com>,
Andrei Vagin <avagin@...gle.com>,
Thierry Delisle <tdelisle@...terloo.ca>
Subject: Re: [PATCH 2/4 v0.5] sched/umcg: RFC: add userspace atomic helpers
On Wed, Sep 8, 2021 at 4:39 PM Jann Horn <jannh@...gle.com> wrote:
Thanks a lot for the reviews, Jann!
I understand how to address most of your comments. However, one issue
I'm not sure what to do about:
[...]
> If this function is not allowed to sleep, as the comment says...
[...]
> ... then I'm pretty sure you can't call fix_pagefault() here, which
> acquires the mmap semaphore (which may involve sleeping) and then goes
> through the pagefault handling path (which can also sleep for various
> reasons, like allocating memory for pagetables, loading pages from
> disk / NFS / FUSE, and so on).
<quote from peterz@ from
https://lore.kernel.org/lkml/20210609125435.GA68187@worktop.programming.kicks-ass.net/>:
So a PF_UMCG_WORKER would be added to sched_submit_work()'s PF_*_WORKER
path to capture these tasks blocking. The umcg_sleeping() hook added
there would:
put_user(BLOCKED, umcg_task->umcg_status);
...
</quote>
Which is basically what I am doing here: in sched_submit_work() I need
to read/write to userspace; and we cannot sleep in
sched_submit_work(), I believe.
If you are right that it is impossible to deal with pagefaults from
within non-sleepable contexts, I see two options:
Option 1: as you suggest, pin pages holding struct umcg_task in sys_umcg_ctl;
or
Option 2: add more umcg-related kernel state to task_struct so that
reading/writing to userspace is not necessary in sched_submit_work().
The first option sounds much better from the code simplicity point of
view, but I'm not sure if it is a viable approach, i.e. I'm afraid
we'll get a hard NACK here, as a non-privileged process will be able
to force the kernel to pin a page per task/thread. We may get around
it by first pinning a limited number of pages, then having the
userspace allocate structs umcg_task on those pages, so that a pinned
page would cover more than a single task/thread. And have a sysctl
that limits the number of pinned pages per MM.
Peter Z., could you, please, comment here? Do you think pinning pages
to hold structs umcg_task is acceptable?
[...]
Powered by blists - more mailing lists