lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250205-bauhof-fraktionslos-b1bedfe50db2@brauner>
Date: Wed, 5 Feb 2025 12:49:30 +0100
From: Christian Brauner <brauner@...nel.org>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Oleg Nesterov <oleg@...hat.com>, 
	Linus Torvalds <torvalds@...ux-foundation.org>, "Michael S. Tsirkin" <mst@...hat.com>, 
	"Eric W. Biederman" <ebiederm@...ssion.com>, linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [GIT PULL] KVM changes for Linux 6.14

On Tue, Feb 04, 2025 at 05:05:06PM +0100, Paolo Bonzini wrote:
> On Tue, Feb 4, 2025 at 3:19 PM Christian Brauner <brauner@...nel.org> wrote:
> >
> > On Mon, Jan 27, 2025 at 04:15:01PM +0100, Paolo Bonzini wrote:
> > > On Mon, Jan 27, 2025 at 3:10 PM Oleg Nesterov <oleg@...hat.com> wrote:
> > > > On 01/26, Linus Torvalds wrote:
> > > > > On Sun, 26 Jan 2025 at 10:54, Oleg Nesterov <oleg@...hat.com> wrote:
> > > > > >
> > > > > > I don't think we even need to detect the /proc/self/ or /proc/self-thread/
> > > > > > case, next_tid() can just check same_thread_group,
> > > > >
> > > > > That was my thinking yes.
> > > > >
> > > > > If we exclude them from /proc/*/task entirely, I'd worry that it would
> > > > > hide it from some management tool and be used for nefarious purposes
> > > >
> > > > Agreed,
> > > >
> > > > > (even if they then show up elsewhere that the tool wouldn't look at).
> > > >
> > > > Even if we move them from /proc/*/task to /proc ?
> > >
> > > Indeed---as long as they show up somewhere, it's not worse than it
> > > used to be. The reason why I'd prefer them to stay in /proc/*/task is
> > > that moving them away at least partly negates the benefits of the
> > > "workers are children of the starter" model. For example it
> > > complicates measuring their cost within the process that runs the VM.
> > > Maybe it's more of a romantic thing than a real practical issue,
> > > because in the real world resource accounting for VMs is done via
> > > cgroups. But unlike the lazy creation in KVM, which is overall pretty
> > > self-contained, I am afraid the ugliness in procfs would be much worse
> > > compared to the benefit, if there's a benefit at all.
> > >
> > > > Perhaps, I honestly do not know what will/can confuse userspace more.
> > >
> > > At the very least, marking workers as "Kthread: 1" makes sense and

You mean in /proc/<pid>/status? Yeah, we can do that. This expands the
definition of Kthread a bit. It would now mean anything that the kernel
spawned for userspace. But that is probably fine.

But it won't help with the problem of just checking /proc/<pid>/task/ to
figure out whether the caller is single-threaded or not. If the caller
has more than 1 entry in there they need to walk through all of them and
parse through /proc/<pid>/status to discount them if they're kernel
threads.

> > > should not cause too much confusion. I wouldn't go beyond that unless
> > > we get more reports of similar issues, and I'm not even sure how
> > > common it is for userspace libraries to check for single-threadedness.
> >
> > Sorry, just saw this thread now.
> >
> > What if we did what Linus suggests and hide (odd) user workers from
> > /proc/<pid>/task/* but also added /proc/<pid>/workers/*. The latter
> > would only list the workers that got spawned by the kernel for that
> > particular task? This would acknowledge their somewhat special status
> > and allow userspace to still detect them as "Hey, I didn't actually
> > spawn those, they got shoved into my workload by the kernel for me.".
> 
> Wouldn't the workers then disappear completely from ps, top or other
> tools that look at /proc/$PID/task? That seems a bit too underhanded
> towards userspace...

So maybe, but then there's also the possibility to do:

- Have /proc/<pid>/status list all tasks.
- Have /proc/<pid>/worker only list user workers spawned by the kernel for userspace.

count(/proc/<pid>/status) - count(/proc/<pid>/workers) == 1 => (userspace) single threaded

My wider point is that I would prefer we add something that is
consistent and doesn't have to give the caller a different view than a
foreign task. I think that will just create confusion in the long run.

Btw, checking whether single-threaded this way:

  fn is_single_threaded() -> io::Result<bool> {
      match count_dir_entries("/proc/self/task") {
          Ok(1) => Ok(true),
          Ok(_) => Ok(false),
          Err(e) => Err(e),
      }
  }

can be simplified. It should be sufficient to do:

stat("/proc/self/task", &st);
if ((st->st_nlink - 2) == 1)
	// single threaded

Since procfs adds the number of tasks to st_nlink (Which is a bit weird
given that /proc/<pid>/fd puts the number of file descriptors in
st->st_size.).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ