linux-kernel - Re: [PATCH v10 3/3] mm: add anonymous vma name refcounting

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJuCfpHaF1e0V=wAoNO36nRL2A5EaNnuQrvZ2K3wh6PL6FrwZQ@mail.gmail.com>
Date:   Mon, 11 Oct 2021 18:20:25 -0700
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Kees Cook <keescook@...omium.org>, Pavel Machek <pavel@....cz>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        David Hildenbrand <david@...hat.com>,
        John Hubbard <jhubbard@...dia.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Colin Cross <ccross@...gle.com>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Matthew Wilcox <willy@...radead.org>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Al Viro <viro@...iv.linux.org.uk>,
        Randy Dunlap <rdunlap@...radead.org>,
        Kalesh Singh <kaleshsingh@...gle.com>,
        Peter Xu <peterx@...hat.com>, rppt@...nel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Catalin Marinas <catalin.marinas@....com>,
        vincenzo.frascino@....com,
        Chinwen Chang (張錦文) 
        <chinwen.chang@...iatek.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Jann Horn <jannh@...gle.com>, apopple@...dia.com,
        Yu Zhao <yuzhao@...gle.com>, Will Deacon <will@...nel.org>,
        fenghua.yu@...el.com, thunder.leizhen@...wei.com,
        Hugh Dickins <hughd@...gle.com>, feng.tang@...el.com,
        Jason Gunthorpe <jgg@...pe.ca>, Roman Gushchin <guro@...com>,
        Thomas Gleixner <tglx@...utronix.de>, krisman@...labora.com,
        Chris Hyser <chris.hyser@...cle.com>,
        Peter Collingbourne <pcc@...gle.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Jens Axboe <axboe@...nel.dk>, legion@...nel.org,
        Rolf Eike Beer <eb@...ix.com>,
        Cyrill Gorcunov <gorcunov@...il.com>,
        Muchun Song <songmuchun@...edance.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Thomas Cedeno <thomascedeno@...gle.com>, sashal@...nel.org,
        cxfcosmos@...il.com, LKML <linux-kernel@...r.kernel.org>,
        linux-fsdevel@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-mm <linux-mm@...ck.org>,
        kernel-team <kernel-team@...roid.com>,
        Tim Murray <timmurray@...gle.com>
Subject: Re: [PATCH v10 3/3] mm: add anonymous vma name refcounting

On Mon, Oct 11, 2021 at 6:18 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
>
> On Mon, Oct 11, 2021 at 1:36 AM Michal Hocko <mhocko@...e.com> wrote:
> >
> > On Fri 08-10-21 13:58:01, Kees Cook wrote:
> > > - Strings for "anon" specifically have no required format (this is good)
> > >   it's informational like the task_struct::comm and can (roughly)
> > >   anything. There's no naming convention for memfds, AF_UNIX, etc. Why
> > >   is one needed here? That seems like a completely unreasonable
> > >   requirement.
> >
> > I might be misreading the justification for the feature. Patch 2 is
> > talking about tools that need to understand memeory usage to make
> > further actions. Also Suren was suggesting "numbering convetion" as an
> > argument against.
> >
> > So can we get a clear example how is this being used actually? If this
> > is just to be used to debug by humans than I can see an argument for
> > human readable form. If this is, however, meant to be used by tools to
> > make some actions then the argument for strings is much weaker.
>
> The simplest usecase is when we notice that a process consumes more
> memory than usual and we do "cat /proc/$(pidof my_process)/maps" to
> check which area is contributing to this growth. The names we assign
> to anonymous areas are descriptive enough for a developer to get an
> idea where the increased consumption is coming from and how to proceed
> with their investigation.
> There are of course cases when tools are involved, but the end-user is
> always a human and the final report should contain easily
> understandable data.
>
> IIUC, the main argument here is whether the userspace can provide
> tools to perform the translations between ids and names, with the
> kernel accepting and reporting ids instead of strings. Technically
> it's possible, but to be practical that conversion should be fast
> because we will need to make name->id conversion potentially for each
> mmap. On the consumer side the performance is not as critical, but the
> fact that instead of dumping /proc/$pid/maps we will have to parse the
> file, do id->name conversion and replace all [anon:id] with
> [anon:name] would be an issue when we do that in bulk, for example
> when collecting system-wide data for a bugreport.
>
> I went ahead and implemented the proposed userspace solution involving
> tmpfs as a repository for name->id mapping (more precisely
> filename->inode mapping). Profiling shows that open()+fstat()+close()
> takes:
> - roughly 15 times longer than mmap() with 1000 unique names each
> being reused 50 times.
> - roughly 3 times longer than mmap() with 100 unique names each being
> reused 500 times. This is due to lstat() optimization suggested by
> Rasmus which avoids open() and close().
> For comparison, proposed prctl() takes roughly the same amount of time
> as mmap() and does not depend on the number of unique names.
>
> I'm still evaluating the proposal to use memfds but I'm not sure if
> the issue that David Hildenbrand mentioned about additional memory
> consumed in pagecache (which has to be addressed) is the only one we
> will encounter with this approach. If anyone knows of any potential
> issues with using memfds as named anonymous memory, I would really
> appreciate your feedback before I go too far in that direction.
> Thanks,
> Suren.

Just noticed that timmurray@ was dropped from the last reply. Adding him back.

>
> > --
> > Michal Hocko
> > SUSE Labs