Message-ID: <CALvZod4ru7F38tAO-gM9ZFKaEhS0w3KqFbPwhwcTvgJs4xMUow@mail.gmail.com>
Date:   Mon, 9 Jan 2023 16:18:12 -0800
From:   Shakeel Butt <shakeelb@...gle.com>
To:     "T.J. Mercier" <tjmercier@...gle.com>
Cc:     Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Arve Hjønnevåg <arve@...roid.com>,
        Todd Kjos <tkjos@...roid.com>,
        Martijn Coenen <maco@...roid.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Christian Brauner <brauner@...nel.org>,
        Carlos Llamas <cmllamas@...gle.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Christian König <christian.koenig@....com>,
        Michal Hocko <mhocko@...nel.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Muchun Song <muchun.song@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Paul Moore <paul@...l-moore.com>,
        James Morris <jmorris@...ei.org>,
        "Serge E. Hallyn" <serge@...lyn.com>,
        Stephen Smalley <stephen.smalley.work@...il.com>,
        Eric Paris <eparis@...isplace.org>, daniel.vetter@...ll.ch,
        android-mm@...gle.com, jstultz@...gle.com, cgroups@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-media@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        linaro-mm-sig@...ts.linaro.org, linux-mm@...ck.org,
        linux-security-module@...r.kernel.org, selinux@...r.kernel.org
Subject: Re: [PATCH 0/4] Track exported dma-buffers with memcg

Hi T.J.,

On Mon, Jan 9, 2023 at 1:38 PM T.J. Mercier <tjmercier@...gle.com> wrote:
>
> Based on discussions at LPC, this series adds a memory.stat counter for
> exported dmabufs. This counter allows us to continue tracking
> system-wide total exported buffer sizes which there is no longer any
> way to get without DMABUF_SYSFS_STATS, and adds a new capability to
> track per-cgroup exported buffer sizes. The total (root counter) is
> helpful for accounting in-kernel dmabuf use (by comparing with the sum
> of child nodes or with the sum of sizes of mapped buffers or FD
> references in procfs) in addition to helping identify driver memory
> leaks when in-kernel use continually increases over time. With
> per-application cgroups, the per-cgroup counter allows us to quickly
> see how much dma-buf memory an application has caused to be allocated.
> This avoids the need to read through all of procfs which can be a
> lengthy process, and causes the charge to "stick" to the allocating
> process/cgroup as long as the buffer is alive, regardless of how the
> buffer is shared (unless the charge is transferred).
>
> The first patch adds the counter to memcg. The next two patches allow
> the charge for a buffer to be transferred across cgroups which is
> necessary because of the way most dmabufs are allocated from a central
> process on Android. The fourth patch adds a SELinux hook to binder in
> order to control who is allowed to transfer buffer charges.
>
> [1] https://lore.kernel.org/all/20220617085702.4298-1-christian.koenig@amd.com/
>

I am a bit confused by the term "charge" used in this patch series.
From the patches, it seems like only a memcg stat is added and nothing
is charged to the memcg.

This leads me to the question: Why add this stat in memcg if the
underlying memory is not charged to the memcg and if we don't really
want to limit the usage?

I see two ways forward:

1. Instead of memcg, use bpf-rstat [1] infra to implement the
per-cgroup stat for dmabuf. (You may need an additional hook for the
stat transfer).

2. Charge the actual memory to the memcg. Since the size of a dmabuf is
immutable across its lifetime, you will not need to do accounting at
page level and can instead use something similar to the network memory
accounting interface/mechanism (or something even simpler). However, you
would need to handle reclaim, OOM, and the charge context and failure
cases. If you are not looking to limit the usage of dmabuf, this option
is overkill.
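
To make option 2 concrete, here is a rough userspace toy model of
buffer-granularity charging. It is only a sketch: every name in it
(toy_cgroup, toy_try_charge, toy_dmabuf_export, and so on) is
hypothetical and none of it is kernel API or code from this series. It
assumes the whole buffer size is charged once at export, checked
against each ancestor's limit, and uncharged once at release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a cgroup; not a kernel structure. */
struct toy_cgroup {
	struct toy_cgroup *parent;
	size_t usage;	/* bytes charged here, including descendants */
	size_t limit;	/* 0 means "no limit" in this model */
};

/* Charge nbytes to cg and every ancestor, all-or-nothing. */
static bool toy_try_charge(struct toy_cgroup *cg, size_t nbytes)
{
	struct toy_cgroup *c;

	/* Pass 1: refuse if any level would exceed its limit. */
	for (c = cg; c; c = c->parent) {
		if (!c->limit)
			continue;
		/* A real implementation would try reclaim/OOM before failing. */
		if (c->usage > c->limit || nbytes > c->limit - c->usage)
			return false;
	}

	/* Pass 2: commit the charge up the hierarchy. */
	for (c = cg; c; c = c->parent)
		c->usage += nbytes;
	return true;
}

static void toy_uncharge(struct toy_cgroup *cg, size_t nbytes)
{
	struct toy_cgroup *c;

	for (c = cg; c; c = c->parent) {
		assert(c->usage >= nbytes);	/* uncharge must match a prior charge */
		c->usage -= nbytes;
	}
}

/* Hypothetical buffer: size is fixed at export, so one charge suffices. */
struct toy_dmabuf {
	size_t size;
	struct toy_cgroup *owner;	/* cgroup currently holding the charge */
};

static bool toy_dmabuf_export(struct toy_dmabuf *buf, struct toy_cgroup *cg,
			      size_t size)
{
	if (!toy_try_charge(cg, size))
		return false;	/* the failure case the exporter must handle */
	buf->size = size;
	buf->owner = cg;
	return true;
}

static void toy_dmabuf_release(struct toy_dmabuf *buf)
{
	toy_uncharge(buf->owner, buf->size);
	buf->owner = NULL;
}
```

The point of the model is that export can now fail, which is exactly
the extra handling a stat-only counter avoids.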

Please let me know if I misunderstood something.

[1] https://lore.kernel.org/all/20220824233117.1312810-1-haoluo@google.com/

thanks,
Shakeel
