Message-ID: <CAAPL-u9HHgPDj_xTTx=GqPg49DcrpGP1FF8zhaog=9awwu0f_Q@mail.gmail.com>
Date:   Thu, 2 Nov 2023 13:22:17 -0700
From:   Wei Xu <weixugc@...gle.com>
To:     Pasha Tatashin <pasha.tatashin@...een.com>
Cc:     David Hildenbrand <david@...hat.com>,
        Sourav Panda <souravpanda@...gle.com>, corbet@....net,
        gregkh@...uxfoundation.org, rafael@...nel.org,
        akpm@...ux-foundation.org, mike.kravetz@...cle.com,
        muchun.song@...ux.dev, rppt@...nel.org, rdunlap@...radead.org,
        chenlinxuan@...ontech.com, yang.yang29@....com.cn,
        tomas.mudrunka@...il.com, bhelgaas@...gle.com, ivan@...udflare.com,
        yosryahmed@...gle.com, hannes@...xchg.org, shakeelb@...gle.com,
        kirill.shutemov@...ux.intel.com, wangkefeng.wang@...wei.com,
        adobriyan@...il.com, vbabka@...e.cz, Liam.Howlett@...cle.com,
        surenb@...gle.com, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-mm@...ck.org, willy@...radead.org,
        Greg Thelen <gthelen@...gle.com>
Subject: Re: [PATCH v5 1/1] mm: report per-page metadata information

On Thu, Nov 2, 2023 at 11:34 AM Pasha Tatashin
<pasha.tatashin@...een.com> wrote:
>
> > > > I could have sworn that I pointed that out in a previous version and
> > > > requested to document that special case in the patch description. :)
> > >
> > > Sounds good, we will document that parts of per-page may not be part
> > > of MemTotal.
> >
> > But this still doesn't answer how we can use the new PageMetadata
> > field to help break down the runtime kernel overhead within MemUsed
> > (MemTotal - MemFree).
>
> I am not sure it matters to the end users: they look at PageMetadata
> with or without Page Owner, page_table_check, HugeTLB and it shows
> exactly how much per-page overhead changed. Where the kernel allocated
> that memory is not that important to the end user as long as that
> memory became available to them.
>
> In addition, it is still possible to estimate the actual memblock part
> of Per-page metadata by looking at /proc/zoneinfo:
>
> Memblock reserved per-page metadata: "present_pages - managed_pages"

This assumes that all reserved memblocks are per-page metadata. As I
mentioned earlier, it is not a robust approach.
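
For illustration only, here is a minimal Python sketch of the estimate
quoted above, assuming the usual /proc/zoneinfo layout with a
"Node N, zone NAME" header followed by "present" and "managed" lines.
The function name and the sample text are made up for this example and
are not part of the patch. The result counts every memblock-reserved
page in the zone, not only per-page metadata, which is exactly the
limitation described above.

```python
import re

# Matches zone headers such as "Node 0, zone   Normal".
ZONE_HEADER = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)")


def reserved_pages_per_zone(zoneinfo_text):
    """Return {(node, zone): present - managed}, in pages.

    Hypothetical helper: the difference covers all pages that were
    never handed to the buddy allocator, so it is only an upper bound
    on the memblock part of per-page metadata.
    """
    counters = {}
    current = None
    for line in zoneinfo_text.splitlines():
        match = ZONE_HEADER.match(line)
        if match:
            current = (int(match.group(1)), match.group(2))
            counters[current] = {}
            continue
        parts = line.split()
        if (current is not None and len(parts) == 2
                and parts[0] in ("present", "managed")):
            counters[current][parts[0]] = int(parts[1])
    return {
        zone: c["present"] - c["managed"]
        for zone, c in counters.items()
        if "present" in c and "managed" in c
    }


# Made-up sample in the shape of /proc/zoneinfo (values are illustrative).
SAMPLE_ZONEINFO = """\
Node 0, zone      DMA
  pages free     3840
        spanned  4095
        present  3998
        managed  3840
Node 0, zone   Normal
  pages free     100000
        spanned  1048576
        present  1048576
        managed  1030000
"""

if __name__ == "__main__":
    for (node, zone), pages in reserved_pages_per_zone(SAMPLE_ZONEINFO).items():
        print(f"node {node} zone {zone}: {pages} pages not managed by buddy")
```

On a live system the same function could be fed the contents of
/proc/zoneinfo instead of the sample string.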

> If there is something big that we will allocate in that range, we
> should probably also export it in some form.
>
> If this field does not fit in /proc/meminfo due to not fully being
> part of MemTotal, we could just keep it under nodeN/, as a separate
> file, as suggested by Greg.
>
> However, I think it is useful enough to have an easy system wide view
> for Per-page metadata.

It is fine to have this as a separate, informational sysfs file under
nodeN/, outside of meminfo. I just don't think that the current
implementation (where PageMetadata is a mixture of buddy and memblock
allocations) can help with the use case that motivates this change,
i.e. improving the breakdown of the kernel overhead.

> > > > > are allocated), so what would be the best way to export page metadata
> > > > > without redefining MemTotal? Keep the new field in /proc/meminfo but
> > > > > be ok that it is not part of MemTotal or do two counters? If we do two
> > > > > counters, we will still need to keep one that is a buddy allocator in
> > > > > /proc/meminfo and the other one somewhere outside?
> > > >
> >
> > I think the simplest thing to do now is to only report the buddy
> > allocations of per-page metadata in meminfo.  The meaning of the new
>
> This will cause PageMetadata to be 0 on 99% of the systems, and
> essentially become useless to the vast majority of users.

I don't think it is a major issue. There are other fields (e.g. Zswap)
in meminfo that remain 0 when the feature is not used.
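
As a rough illustration, the Python sketch below parses meminfo-style
text, computes MemUsed as defined earlier in this thread
(MemTotal - MemFree), and lists the fields that read 0. The helper
names and the sample values are invented for this example; only the
"Name:   value kB" line format is assumed from /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse "Name:   value [kB]" lines into {name: int}.

    Most values are in kB; a few fields are plain counts with no unit.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def mem_used_kb(fields):
    """MemUsed as used in this thread: MemTotal - MemFree."""
    return fields["MemTotal"] - fields["MemFree"]


def zero_fields(fields):
    """Names of fields that currently read 0, sorted alphabetically."""
    return sorted(name for name, value in fields.items() if value == 0)


# Made-up sample in the shape of /proc/meminfo (values are illustrative).
SAMPLE_MEMINFO = """\
MemTotal:       16000000 kB
MemFree:         4000000 kB
Zswap:                 0 kB
Zswapped:              0 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE_MEMINFO)
    print("MemUsed (kB):", mem_used_kb(info))
    print("Fields reading 0:", ", ".join(zero_fields(info)))
```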
