lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 28 Sep 2017 08:02:23 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Alexey Dobriyan <adobriyan@...il.com>
Cc:     Michael Kerrisk <mtk.manpages@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Djalal Harouni <tixxdz@...il.com>,
        Alexey Gladkov <gladkov.alexey@...il.com>,
        Aliaksandr Patseyenak <Aliaksandr_Patseyenak1@...m.com>,
        Tatsiana Brouka <Tatsiana_Brouka@...m.com>
Subject: Re: [PATCH 1/2 v2] fdmap(2)

On Thu, Sep 28, 2017 at 3:55 AM, Alexey Dobriyan <adobriyan@...il.com> wrote:
> On 9/28/17, Michael Kerrisk (man-pages) <mtk.manpages@...il.com> wrote:
>> On 27 September 2017 at 17:03, Andy Lutomirski <luto@...capital.net> wrote:
>
>>>> The idea is to start process. In ideal world, only bynary system calls
>>>> would exist and shells could emulate /proc/* same way bash implement
>>>> /dev/tcp
>>>
>>> Then start the process by doing it for real and making it obviously
>>> useful.  We should not add a pair of vaguely useful, rather weak
>>> syscalls just to start a process of modernizing /proc.
>
> Before doing it for real it would be nice to have at least a nod
> from people in charge that syscalls which return binary
> information are OK. Otherwise some EIATF guy will just say
> "NAK /proc is fine, it always was fine".

There's nothing inherently wrong with syscalls that return binary
information.  There is something wrong with reinventing the world with
insufficient justification, though.

/proc may be old, clunky, and kind of slow, but it has a lot of good
things going for it.  It supports access control (DAC and MAC).  It
handles namespacing in a way that's awkward but supported by all the
existing namespace managers.  It may soon support mount options, which
is rather important.

I feel like we've been discussing this performance issue for over a
year, and I distinctly recall discussing it in Santa Fe.  I suggested
a two-pronged approach:

1. Add a new syscall that will, in a single call, open, read, and
close a proc file and maybe even a regular non-proc file.  Like this:

long readfileat(int dirfd, const char *path, void *buf, size_t len, int flags);

2. Where needed, add new /proc files with lighter-weight
representations.  I think we discussed something that's like status
but in nl_attr format.

This doesn't end up with a bunch of code duplication the way that a
full-blown syscall-based reimplementation would.  It supports all the
/proc features rather than just a subset.  It fully respects access
control, future mount options, and the potential lack of a /proc
mount.

>
> Or look from another angle: sched_setaffinity exists but there is
> no /proc counterpart, shells must use taskset(1) and world didn't end.

sched_setaffinity() modifies the caller.  /proc wouldn't have made much sense.

>
>> I concur.
>>
>> Alexey, you still have not wxplained who specifically needs this
>> right now, and how, precisely, they plan to use the new system calls.
>> It is all very arm-wavey so far.
>
> It is not if you read even example program in the original patch.
> Any program which queries information about file descriptors
> will benefit both in CPU and memory usage.
>
> void closefrom(int start)
> {
>         int fd[1024];
>         int n;
>
>         while ((n = fdmap(0, fd, sizeof(fd)/sizeof(fd[0]), start)) > 0) {
>                 unsigned int i;
>
>                 for (i = 0; i < n; i++)
>                         close(fd[i]);
>
>                 start = fd[n - 1] + 1;
>         }
> }
>
> CRIU naturally to know everything about descriptors of target processes:
> It does:
>
> int predump_task_files(int pid)
> {
>         struct dirent *de;
>         DIR *fd_dir;
>         int ret = -1;
>
>         pr_info("Pre-dump fds for %d)\n", pid);
>
>         fd_dir = opendir_proc(pid, "fd");
>         if (!fd_dir)
>                 return -1;
>
>         while ((de = readdir(fd_dir))) {
>                 if (dir_dots(de))
>                         continue;
>
>                 if (predump_one_fd(pid, atoi(de->d_name)))
>                         goto out;
>         }
>
>         ret = 0;
> out:
>         closedir(fd_dir);
>         return ret;
> }
>
> which is again inefficient.

And /proc/PID/fds would solve this.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ