Message-ID: <87plk8kj16.fsf@email.froward.int.ebiederm.org>
Date: Sun, 26 Jan 2025 21:55:17 -0600
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>,  "Michael S. Tsirkin"
 <mst@...hat.com>,  Christian Brauner <brauner@...nel.org>,  Oleg Nesterov
 <oleg@...hat.com>,  linux-kernel@...r.kernel.org,  kvm@...r.kernel.org
Subject: Re: [GIT PULL] KVM changes for Linux 6.14

Linus Torvalds <torvalds@...ux-foundation.org> writes:

> On Sat, 25 Jan 2025 at 10:12, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>>
>> Arguably the user space oddity is just strange and Paolo even calls it
>> a bug, but at the same time, I do think user space can and should
>> reasonably expect that it only has children that it created
>> explicitly [..]
>
> Note that I think that doing things like "io_uring" and getting IO
> helper threads that way would very much count as "explicit children",
> so I don't argue that all kernel helper threads would fall under this
> category.
>
> And I suspect that the normal vhost workers fall under that same kind
> of "it's like io_uring". If you use VHOST_NEW_WORKER to create a
> worker thread, then that's a pretty explicit "I have a child process".
>
> So it's really just that hugepage recovery thread that seems to be a
> bit "too" much of an implicit kernel helper thread that user space
> kind of gets accidentally and implicitly just because of a kernel
> implementation detail.
>
> I'm sure the kvm hack to just start it later (at KVM_RUN time?) is
> sufficient in practice, but it still feels conceptually iffy to me.

I don't think implicit vs explicit is the right question.  Rather, we
should be asking: can userspace care?

If I read the context from the commit correctly, what userspace
is asking is:  Am I single threaded, so that I know nothing funny
will happen in the forked process?
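
For illustration only, one way userspace on Linux might ask that question
is to count the entries under /proc/self/task.  This is a sketch, not
what the program in the commit actually does; the helper names
thread_count() and is_single_threaded() are hypothetical, and it assumes
every thread of the process, including any kernel-created helper that is
part of the process, shows up there.

```python
import os


def thread_count():
    """Return the number of threads in this process as seen in /proc.

    Linux-specific: each thread of the calling process has one directory
    under /proc/self/task, named after its thread ID.
    """
    return len(os.listdir("/proc/self/task"))


def is_single_threaded():
    """Hypothetical pre-fork check: True if only one thread exists."""
    return thread_count() == 1
```

A caller would run is_single_threaded() right before fork() and refuse or
warn when it returns False.  A check like this cannot tell a thread that
userspace created from one the kernel added on its behalf.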

The most common funny I am aware of for forked multi-threaded processes
is that if they fork while another thread is holding a lock, the forked
process might hang forever on that lock, because the lock will never
be released.
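
A minimal userspace sketch of that hazard, assuming a Unix system where
os.fork() is available.  The function name child_can_take_lock() is
hypothetical, and the child uses a timeout instead of blocking forever so
that the demonstration terminates.

```python
import os
import threading
import warnings


def child_can_take_lock(timeout=0.5):
    """Fork while another thread holds a lock; report whether the child
    manages to acquire that lock within `timeout` seconds."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Take the lock and keep it until the parent says otherwise.
        with lock:
            held.set()
            release.wait()

    t = threading.Thread(target=holder)
    t.start()
    held.wait()  # the lock is now held by the other thread

    with warnings.catch_warnings():
        # Newer Python versions warn about fork() in a multi-threaded
        # process; that is exactly the situation being demonstrated.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()

    if pid == 0:
        # Child: only the forking thread was copied.  The holder thread
        # does not exist here, but the lock is still marked as held.
        got = lock.acquire(timeout=timeout)
        os._exit(0 if got else 1)

    _, status = os.waitpid(pid, 0)
    release.set()
    t.join()
    return os.WEXITSTATUS(status) == 0
```

In the child the lock stays held with no thread left to release it, so
the acquire is expected to time out; a plain blocking acquire would hang.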

The most interesting part of the hugepage reaper appears to be
kvm_mmu_commit_zap_page, where a page is freed after being flushed from
the tlb.

I would argue that if kvm_mmu_commit_zap_page and friends change the
page tables in a way that userspace can see after a fork, and in turn
could affect how the forked process will execute, then userspace is
doing something sensible in testing for it.

On the flip side, if this isn't something userspace can observe in its
own process, I would argue that the proper solution is to use a regular
kthread.

In summary, the conceptually clean approach is to only have threads that,
when running, can affect the process they are a part of in a userspace
visible way.  Assuming the hugepage reaper can affect the process it is
a part of, the only problem I see is the hugepage reaper existing when
it had nothing it could possibly do.

I don't think hiding threads is a useful solution, because the threads
will affect the process they are a part of.  If the threads aren't
affecting the process they are a part of, we have other solutions besides
threads.

Eric
