lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 27 Jan 2021 12:16:23 -0500
From:   Kalesh Singh <kaleshsingh@...gle.com>
To:     Jann Horn <jannh@...gle.com>
Cc:     Suren Baghdasaryan <surenb@...gle.com>,
        Minchan Kim <minchan@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Hridya Valsaraju <hridya@...gle.com>,
        kernel-team <kernel-team@...roid.com>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Jonathan Corbet <corbet@....net>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Christian König <christian.koenig@....com>,
        Mauro Carvalho Chehab <mchehab+huawei@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Kees Cook <keescook@...omium.org>,
        Alexey Gladkov <gladkov.alexey@...il.com>,
        Szabolcs Nagy <szabolcs.nagy@....com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>,
        Michel Lespinasse <walken@...gle.com>,
        Bernd Edlinger <bernd.edlinger@...mail.de>,
        Andrei Vagin <avagin@...il.com>,
        Yafang Shao <laoar.shao@...il.com>, Hui Su <sh_def@....com>,
        kernel list <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        Linux Media Mailing List <linux-media@...r.kernel.org>,
        DRI mailing list <dri-devel@...ts.freedesktop.org>,
        linaro-mm-sig@...ts.linaro.org,
        Jeffrey Vander Stoep <jeffv@...gle.com>,
        Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH] procfs/dmabuf: Add /proc/<pid>/task/<tid>/dmabuf_fds

On Wed, Jan 27, 2021 at 5:47 AM Jann Horn <jannh@...gle.com> wrote:
>
> +jeffv from Android
>
> On Tue, Jan 26, 2021 at 11:51 PM Kalesh Singh <kaleshsingh@...gle.com> wrote:
> > In order to measure how much memory a process actually consumes, it is
> > necessary to include the DMA buffer sizes for that process in the memory
> > accounting. Since the handle to DMA buffers are raw FDs, it is important
> > to be able to identify which processes have FD references to a DMA buffer.
>
> Or you could try to let the DMA buffer take a reference on the
> mm_struct and account its size into the mm_struct? That would probably
> be nicer to work with than having to poke around in procfs separately
> for DMA buffers.
>
> > Currently, DMA buffer FDs can be accounted using /proc/<pid>/fd/* and
> > /proc/<pid>/fdinfo -- both of which are only root readable, as follows:
>
> That's not quite right. They can both also be accessed by the user
> owning the process. Also, fdinfo is a standard interface for
> inspecting process state that doesn't permit reading process memory or
> manipulating process state - so I think it would be fine to permit
> access to fdinfo under a PTRACE_MODE_READ_FSCRED check, just like the
> interface you're suggesting.


Hi everyone. Thank you for the feedback.

I understand there is a deeper problem of accounting shared memory in
the kernel, that’s not only specific to the DMA buffers. In this case
DMA buffers, I think Jann’s proposal is the cleanest way to attribute
the shared buffers to processes. I can respin a patch modifying fdinfo
as suggested, if this is not an issue from a security perspective.

Thanks,
Kalesh

>
>
> >   1. Do a readlink on each FD.
> >   2. If the target path begins with "/dmabuf", then the FD is a dmabuf FD.
> >   3. stat the file to get the dmabuf inode number.
> >   4. Read/ proc/<pid>/fdinfo/<fd>, to get the DMA buffer size.
> >
> > Android captures per-process system memory state when certain low memory
> > events (e.g a foreground app kill) occur, to identify potential memory
> > hoggers. To include a process’s dmabuf usage as part of its memory state,
> > the data collection needs to be fast enough to reflect the memory state at
> > the time of such events.
> >
> > Since reading /proc/<pid>/fd/ and /proc/<pid>/fdinfo/ requires root
> > privileges, this approach is not suitable for production builds.
>
> It should be easy to add enough information to /proc/<pid>/fdinfo/ so
> that you don't need to look at /proc/<pid>/fd/ anymore.
>
> > Granting
> > root privileges even to a system process increases the attack surface and
> > is highly undesirable. Additionally this is slow as it requires many
> > context switches for searching and getting the dma-buf info.
>
> What do you mean by "context switches"? Task switches or kernel/user
> transitions (e.g. via syscall)?
>
> > With the addition of per-buffer dmabuf stats in sysfs [1], the DMA buffer
> > details can be queried using their unique inode numbers.
> >
> > This patch proposes adding a /proc/<pid>/task/<tid>/dmabuf_fds interface.
> >
> > /proc/<pid>/task/<tid>/dmabuf_fds contains a list of inode numbers for
> > every DMA buffer FD that the task has. Entries with the same inode
> > number can appear more than once, indicating the total FD references
> > for the associated DMA buffer.
> >
> > If a thread shares the same files as the group leader then its
> > dmabuf_fds file will be empty, as these dmabufs are reported by the
> > group leader.
> >
> > The interface requires PTRACE_MODE_READ_FSCRED (same as /proc/<pid>/maps)
> > and allows the efficient accounting of per-process DMA buffer usage without
> > requiring root privileges. (See data below)
>
> I'm not convinced that introducing a new procfs file for this is the
> right way to go. And the idea of having to poke into multiple
> different files in procfs and in sysfs just to be able to compute a
> proper memory usage score for a process seems weird to me. "How much
> memory is this process using" seems like the kind of question the
> kernel ought to be able to answer (and the kernel needs to be able to
> answer somewhat accurately so that its own OOM killer can do its job
> properly)?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ