[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6b0c90e6-6b38-4ff3-8778-1857cd66c206@redhat.com>
Date: Thu, 19 Dec 2024 23:57:40 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Keith Busch <kbusch@...nel.org>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
michael.christie@...cle.com, Tejun Heo <tj@...nel.org>,
Luca Boccassi <bluca@...ian.org>, Jens Axboe <axboe@...com>
Subject: Re: [PATCH] KVM: x86: switch hugepage recovery thread to vhost_task
On 12/19/24 23:23, Keith Busch wrote:
> On Thu, Dec 19, 2024 at 09:30:16PM +0100, Paolo Bonzini wrote:
>> On Thu, Dec 19, 2024 at 7:09 PM Keith Busch <kbusch@...nel.org> wrote:
>>>> Is crosvm trying to do anything but exec? If not, it should probably use the
>>>> flag.
>>>
>>> Good point, and I'm not sure right now. I don't think I know any crosvm
>>> developer experts but I'm working on that to get a better explanation of
>>> what's happening,
>>
>> Ok, I found the code and it doesn't exec (e.g.
>> https://github.com/google/crosvm/blob/b339d3d7/src/crosvm/sys/linux/jail_warden.rs#L122),
>> so that's not an option.
>
> Thanks, I was slowly getting there too. It's been a while since I had to
> work with the languange, so I'm a bit rusty (no pun intended) at
> navigating.
>
>> Well, if I understand correctly from a
>> cursory look at the code, crosvm is creating a jailed child process
>> early, and then spawns further jails through it; so it's just this
>> first process that has to cheat.
>>
>> One possibility on the KVM side is to delay creating the vhost_task
>> until the first KVM_RUN. I don't like it but...
>>
>> I think we should nevertheless add something to the status file in
>> procfs, that makes it easy to detect kernel tasks (PF_KTHREAD |
>> PF_IO_WORKER | PF_USER_WORKER).
>
> I currently think excluding kernel tasks from this check probably aligns
> with what it's trying to do, so anything to make that easier is a good
> step, IMO.
>
It could be as simple as this on the kernel side: [adding Jens for
a first look]
=============== 8< ===========
From: Paolo Bonzini <pbonzini@...hat.com>
Subject: [PATCH] fs: proc: mark user and I/O workers as "kernel threads"
A Rust library called "minijail" is looking at procfs to check if
the current task has multiple threads, and to prevent fork() if it
does. This is because fork() is in general ill-advised in
multi-threaded programs, for example if another thread might have
taken locks.
However, this attempt falls afoul of kernel threads that are children
of the user process that they serve. These are not a problem when
forking, but they are still present in procfs. The library should
discard them, but there is currently no way for userspace to detect
PF_USER_WORKER or PF_IO_WORKER threads.
The closest is the "Kthread" key in /proc/PID/task/TID/status. Extend
it instead of introducing another keyl tasks that are marked with
PF_USER_WORKER or PF_IO_WORKER are not kthreads, but they are close
enough for basically all intents and purposes.
Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 34a47fb0c57f..f702fb50c8ef 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -221,7 +221,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
#endif
seq_putc(m, '\n');
- seq_printf(m, "Kthread:\t%c\n", p->flags & PF_KTHREAD ? '1' : '0');
+ seq_printf(m, "Kthread:\t%c\n", p->flags & (PF_KTHREAD | PF_USER_WORKER | PF_IO_WORKER) ? '1' : '0');
}
void render_sigset_t(struct seq_file *m, const char *header,
Paolo
Powered by blists - more mailing lists