[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211208153420-mutt-send-email-mst@kernel.org>
Date: Wed, 8 Dec 2021 15:34:51 -0500
From: "Michael S. Tsirkin" <mst@...hat.com>
To: Mike Christie <michael.christie@...cle.com>
Cc: geert@...ux-m68k.org, vverma@...italocean.com, hdanton@...a.com,
hch@...radead.org, stefanha@...hat.com, jasowang@...hat.com,
sgarzare@...hat.com, virtualization@...ts.linux-foundation.org,
christian.brauner@...ntu.com, axboe@...nel.dk,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH V6 01/10] Use copy_process in vhost layer
On Mon, Nov 29, 2021 at 01:46:57PM -0600, Mike Christie wrote:
> The following patches made over Linus's tree, allow the vhost layer to do
> a copy_process on the thread that does the VHOST_SET_OWNER ioctl like how
> io_uring does a copy_process against its userspace app. This allows the
> vhost layer's worker threads to inherit cgroups, namespaces, address
> space, etc and this worker thread will also be accounted for against that
> owner/parent process's RLIMIT_NPROC limit.
>
> If you are not familiar with qemu and vhost here is more detailed
> problem description:
>
> Qemu will create vhost devices in the kernel which perform network, SCSI,
> etc IO and management operations from worker threads created by the
> kthread API. Because the kthread API does a copy_process on the kthreadd
> thread, the vhost layer has to use kthread_use_mm to access the Qemu
> thread's memory and cgroup_attach_task_all to add itself to the Qemu
> thread's cgroups.
>
> The problem with this approach is that we then have to add new functions/
> args/functionality for every thing we want to inherit. I started doing
> that here:
>
> https://lkml.org/lkml/2021/6/23/1233
>
> for the RLIMIT_NPROC check, but it seems it might be easier to just
> inherit everything from the beginning, becuase I'd need to do something
> like that patch several times.
So who's merging this? Me? Did all patches get acked by appropriate
maintainers?
> V6:
> - Rename kernel_worker to user_worker and fix prefixes.
> - Add better patch descriptions.
> V5:
> - Handle kbuild errors by building patchset against current kernel that
> has all deps merged. Also add patch to remove create_io_thread code as
> it's not used anymore.
> - Rebase patchset against current kernel and handle a new vm PF_IO_WORKER
> case added in 5.16-rc1.
> - Add PF_USER_WORKER flag so we can check it later after the initial
> thread creation for the wake up, vm and singal cses.
> - Added patch to auto reap the worker thread.
> V4:
> - Drop NO_SIG patch and replaced with Christian's SIG_IGN patch.
> - Merged Christian's kernel_worker_flags_valid helpers into patch 5 that
> added the new kernel worker functions.
> - Fixed extra "i" issue.
> - Added PF_USER_WORKER flag and added check that kernel_worker_start users
> had that flag set. Also dropped patches that passed worker flags to
> copy_thread and replaced with PF_USER_WORKER check.
> V3:
> - Add parentheses in p->flag and work_flags check in copy_thread.
> - Fix check in arm/arm64 which was doing the reverse of other archs
> where it did likely(!flags) instead of unlikely(flags).
> V2:
> - Rename kernel_copy_process to kernel_worker.
> - Instead of exporting functions, make kernel_worker() a proper
> function/API that does common work for the caller.
> - Instead of adding new fields to kernel_clone_args for each option
> make it flag based similar to CLONE_*.
> - Drop unused completion struct in vhost.
> - Fix compile warnings by merging vhost cgroup cleanup patch and
> vhost conversion patch.
> ~
>
Powered by blists - more mailing lists