Message-ID: <CALCETrUJbVSPTpVqmHRTcELUCjAnQWVcsG7urcQgoHv13a+aOQ@mail.gmail.com>
Date: Wed, 27 Sep 2017 08:04:26 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Alexey Dobriyan <adobriyan@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
Randy Dunlap <rdunlap@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Djalal Harouni <tixxdz@...il.com>,
Alexey Gladkov <gladkov.alexey@...il.com>,
Tatsiana Brouka <Tatsiana_Brouka@...m.com>,
Aliaksandr Patseyenak <Aliaksandr_Patseyenak1@...m.com>
Subject: Re: [PATCH v2 2/2] pidmap(2)

On Tue, Sep 26, 2017 at 11:46 AM, Alexey Dobriyan <adobriyan@...il.com> wrote:
> On Sun, Sep 24, 2017 at 02:27:00PM -0700, Andy Lutomirski wrote:
>> On Sun, Sep 24, 2017 at 1:08 PM, Alexey Dobriyan <adobriyan@...il.com> wrote:
>> > From: Tatsiana Brouka <Tatsiana_Brouka@...m.com>
>> >
>> > Implement system call for bulk retrieving of pids in binary form.
>> >
>> > Using /proc is slower than necessary: 3 syscalls + another 3 for each thread +
>> > converting with atoi() + instantiating dentries and inodes.
>> >
>> > /proc may be not mounted especially in containers. Natural extension of
>> > hidepid=2 efforts is to not mount /proc at all.
>> >
>> > It could be used by programs like ps, top or CRIU. Speed increase will
>> > become more drastic once combined with bulk retrieval of process statistics.
>> >
>> > Benchmark:
>> >
>> > N=1<<16 times
>> > ~130 processes (~250 task_structs) on a regular desktop system
>> > opendir + readdir + closedir /proc + the same for every /proc/$PID/task
>> > (roughly what htop(1) does) vs pidmap
>> >
>> > /proc 16.80 ± 0.73%
>> > pidmap 0.06 ± 0.31%
>> >
>> > PIDMAP_* flags are modelled after /proc/task_diag patchset.
>> >
>> >
>> > PIDMAP(2) Linux Programmer's Manual PIDMAP(2)
>> >
>> > NAME
>> > pidmap - get allocated PIDs
>> >
>> > SYNOPSIS
>> > long pidmap(pid_t pid, int *pids, unsigned int count, unsigned int start, int flags);
>>
>> I think we will seriously regret a syscall that does this. Djalal is
>> working on fixing the turd that is hidepid, and this syscall is
>> basically incompatible with ever fixing hidepids. I think that, to
>> make it less regrettable, it needs to take an fd to a proc mount as a
>> parameter. This makes me wonder why it's a syscall at all -- why not
>> just create a new file like /proc/pids?
>
> See reply to fdmap(2).
>
> pidmap(2) is indeed more complex case exactly because of
> pid/tgid/tid/everything else + pidnamespaces + ->hide_pid.
> However the problem remains: query task tree without all the bullshit.
> C/R people succumbed with /proc/*/children, it was a mistake IMO.

Your syscall cannot be implemented sanely. It doesn't remove bullshit
-- it adds bullshit. NAK.