Message-ID: <65b77d3e-d683-1e90-ebb0-5c7758143048@google.com>
Date: Wed, 10 Apr 2024 16:58:26 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
cc: Sourav Panda <souravpanda@...gle.com>, corbet@....net,
gregkh@...uxfoundation.org, rafael@...nel.org, mike.kravetz@...cle.com,
muchun.song@...ux.dev, rppt@...nel.org, david@...hat.com,
rdunlap@...radead.org, chenlinxuan@...ontech.com, yang.yang29@....com.cn,
tomas.mudrunka@...il.com, bhelgaas@...gle.com, ivan@...udflare.com,
pasha.tatashin@...een.com, yosryahmed@...gle.com, hannes@...xchg.org,
shakeelb@...gle.com, kirill.shutemov@...ux.intel.com,
wangkefeng.wang@...wei.com, adobriyan@...il.com,
Vlastimil Babka <vbabka@...e.cz>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, surenb@...gle.com,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
Matthew Wilcox <willy@...radead.org>, weixugc@...gle.com
Subject: Re: [PATCH v9 1/1] mm: report per-page metadata information
On Tue, 19 Mar 2024, Andrew Morton wrote:
> On Tue, 20 Feb 2024 13:45:58 -0800 Sourav Panda <souravpanda@...gle.com> wrote:
>
> > Adds two new per-node fields, namely nr_memmap and nr_memmap_boot,
> > to /sys/devices/system/node/nodeN/vmstat and a global Memmap field
> > to /proc/meminfo. This information can be used by users to see how
> > much memory is being used by per-page metadata, which can vary
> > depending on build configuration, machine architecture, and system
> > use.
>
> I yield to no man in my admiration of changelogging but boy, that's a
> lot of changelogging. Would it be possible to consolidate the [0/N]
> coverletter and the [1/N] changelog into a single thing please?
>
> > Documentation/filesystems/proc.rst | 3 +++
> > fs/proc/meminfo.c | 4 ++++
> > include/linux/mmzone.h | 4 ++++
> > include/linux/vmstat.h | 4 ++++
> > mm/hugetlb_vmemmap.c | 17 ++++++++++++----
> > mm/mm_init.c | 3 +++
> > mm/page_alloc.c | 1 +
> > mm/page_ext.c | 32 +++++++++++++++++++++---------
> > mm/sparse-vmemmap.c | 8 ++++++++
> > mm/sparse.c | 7 ++++++-
> > mm/vmstat.c | 26 +++++++++++++++++++++++-
> > 11 files changed, 94 insertions(+), 15 deletions(-)
>
> And yet we offer the users basically no documentation. The new sysfs
> file should be documented under Documentation/ABI somewhere and
> perhaps we could prepare some more expansive user-facing documentation
> elsewhere?
>
Sourav, is it possible to refresh this series into a v10 on top of the
latest upstream kernel with a single condensed changelog that details the
current behavior, what extension this is adding, and how it is generally
useful?
As noted here, the cover letter has great material discussing the
rationale for this change, but that material would be lost if only this
patch is merged.  Typically the cover letter gets concatenated into the
changelog, but in this case there's a lot of overlap.
A single patch that includes a succinct changelog would be awesome.
And then the requested documentation in Documentation/ABI either included
in the same patch or as a second patch in the series?
I don't think the resulting patch series will actually need a cover letter
after that; it will be able to stand on its own.
> I'd like to hear others' views on the overall usefulness/utility of this
> change, please?
>
The immediate use case, likely shared by all hyperscalers, is to track
boot memory overhead and any regression over time (across kernel
upgrades, firmware upgrades, etc) that may change the amount of total
memory available.  We'd want to subtract out the boot overhead that we
know about (like struct page here) and then alert on any regression where
we're losing memory from reboot to reboot for any reason.
This increased visibility into boot memory overhead lets us track changes
over time when the attribution of that memory is otherwise not available.
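To illustrate how the new fields might be consumed for that kind of
regression tracking, here is a small sketch.  It assumes the field names
from the patch (a global "Memmap" line in /proc/meminfo, reported in kB,
and per-node nr_memmap / nr_memmap_boot counters in
/sys/devices/system/node/nodeN/vmstat, reported in pages); the 4 kB page
size and the sample text below are illustrative assumptions, not real
kernel output.

```python
def parse_meminfo_memmap(text):
    """Return the Memmap value in kB from /proc/meminfo-style text,
    or None if the field is absent (kernel without the patch)."""
    for line in text.splitlines():
        if line.startswith("Memmap:"):
            return int(line.split()[1])
    return None

def parse_vmstat_memmap(text, page_size_kb=4):
    """Sum nr_memmap and nr_memmap_boot (page counts) from a
    nodeN/vmstat dump and convert to kB; the page size is an
    assumption here, not read from the system."""
    pages = 0
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key in ("nr_memmap", "nr_memmap_boot"):
            pages += int(value)
    return pages * page_size_kb

# Synthetic sample data (values are illustrative only):
meminfo_sample = "MemTotal:       32768000 kB\nMemmap:           524288 kB\n"
vmstat_sample = "nr_free_pages 100000\nnr_memmap 120000\nnr_memmap_boot 11072\n"

print(parse_meminfo_memmap(meminfo_sample))  # 524288
print(parse_vmstat_memmap(vmstat_sample))    # (120000 + 11072) * 4 = 524288
```

A monitoring daemon could record these values at each boot and alert when
the delta between reboots exceeds a threshold, which is the subtraction
described above.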