Message-ID: <20170926184643.GC14724@avx2>
Date: Tue, 26 Sep 2017 21:46:43 +0300
From: Alexey Dobriyan <adobriyan@...il.com>
To: Andy Lutomirski <luto@...capital.net>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
Randy Dunlap <rdunlap@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Djalal Harouni <tixxdz@...il.com>,
Alexey Gladkov <gladkov.alexey@...il.com>,
Tatsiana Brouka <Tatsiana_Brouka@...m.com>,
Aliaksandr Patseyenak <Aliaksandr_Patseyenak1@...m.com>
Subject: Re: [PATCH v2 2/2] pidmap(2)
On Sun, Sep 24, 2017 at 02:27:00PM -0700, Andy Lutomirski wrote:
> On Sun, Sep 24, 2017 at 1:08 PM, Alexey Dobriyan <adobriyan@...il.com> wrote:
> > From: Tatsiana Brouka <Tatsiana_Brouka@...m.com>
> >
> > Implement a system call for bulk retrieval of pids in binary form.
> >
> > Using /proc is slower than necessary: 3 syscalls + another 3 for each thread +
> > converting with atoi() + instantiating dentries and inodes.
> >
> > /proc may not be mounted, especially in containers. A natural extension of
> > the hidepid=2 efforts is to not mount /proc at all.
> >
> > It could be used by programs like ps, top or CRIU. Speed increase will
> > become more drastic once combined with bulk retrieval of process statistics.
> >
> > Benchmark:
> >
> > N=1<<16 times
> > ~130 processes (~250 task_structs) on a regular desktop system
> > opendir + readdir + closedir /proc + the same for every /proc/$PID/task
> > (roughly what htop(1) does) vs pidmap
> >
> > /proc 16.80 ± 0.73%
> > pidmap 0.06 ± 0.31%
> >
> > PIDMAP_* flags are modelled after /proc/task_diag patchset.
> >
> >
> > PIDMAP(2) Linux Programmer's Manual PIDMAP(2)
> >
> > NAME
> > pidmap - get allocated PIDs
> >
> > SYNOPSIS
> > long pidmap(pid_t pid, int *pids, unsigned int count, unsigned int start, int flags);
>
> I think we will seriously regret a syscall that does this. Djalal is
> working on fixing the turd that is hidepid, and this syscall is
> basically incompatible with ever fixing hidepids. I think that, to
> make it less regrettable, it needs to take an fd to a proc mount as a
> parameter. This makes me wonder why it's a syscall at all -- why not
> just create a new file like /proc/pids?

See reply to fdmap(2).

pidmap(2) is indeed a more complex case, exactly because of
pid/tgid/tid/everything else + pid namespaces + ->hide_pid.

However, the problem remains: querying the task tree without all the bullshit.
C/R people succumbed to /proc/*/children; that was a mistake IMO.
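
For reference, the /proc side of the quoted benchmark (opendir + readdir + closedir on /proc, then the same for every /proc/$PID/task, with atoi() on each name) might look roughly like the sketch below. The names walk_proc() and is_numeric() are made up for illustration. This is not the benchmark code from the patch, and it only counts tasks rather than timing anything.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if name consists only of decimal digits (a PID or TID entry). */
static int is_numeric(const char *name)
{
    if (*name == '\0')
        return 0;
    for (; *name != '\0'; name++)
        if (!isdigit((unsigned char)*name))
            return 0;
    return 1;
}

/*
 * Hypothetical helper: walk /proc and every /proc/$PID/task.
 * Stores the number of processes in *nproc and the number of
 * threads in *nthread.
 * Returns 0 on success, -1 if /proc cannot be opened (e.g. not mounted).
 */
int walk_proc(long *nproc, long *nthread)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    char path[64];

    if (proc == NULL)
        return -1;

    *nproc = 0;
    *nthread = 0;

    while ((de = readdir(proc)) != NULL) {
        DIR *task;
        struct dirent *te;
        int pid;

        /* Skip non-PID entries such as "self", "meminfo", ".". */
        if (!is_numeric(de->d_name))
            continue;

        /* Text-to-binary conversion that a binary interface would avoid. */
        pid = atoi(de->d_name);
        (*nproc)++;

        snprintf(path, sizeof(path), "/proc/%d/task", pid);
        task = opendir(path);
        if (task == NULL)
            continue; /* process may have exited since readdir() */

        while ((te = readdir(task)) != NULL)
            if (is_numeric(te->d_name))
                (*nthread)++;

        closedir(task);
    }

    closedir(proc);
    return 0;
}
```

A caller would declare two long counters, pass their addresses to walk_proc(), and check for a -1 return, which is what it gets when /proc is not mounted.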