linux-kernel - Re: [PATCH v2 2/2] mm: memcg: normalize the value passed into memcg_rstat

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAJD7tkbsgDxTYED8Yp3DJ8e552Zou+DVEutHsr2PLXssm-V9YA@mail.gmail.com>
Date:   Tue, 3 Oct 2023 12:51:49 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Michal Koutný <mkoutny@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Muchun Song <muchun.song@...ux.dev>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] mm: memcg: normalize the value passed into memcg_rstat_updated()

On Tue, Oct 3, 2023 at 11:22 AM Michal Koutný <mkoutny@...e.com> wrote:
>
> On Fri, Sep 22, 2023 at 05:57:40PM +0000, Yosry Ahmed <yosryahmed@...gle.com> wrote:
> > memcg_rstat_updated() uses the value of the state update to keep track
> > of the magnitude of pending updates, so that we only do a stats flush
> > when it's worth the work. Most values passed into memcg_rstat_updated()
> > are in pages, however, a few of them are actually in bytes or KBs.
> >
> > To put this into perspective, a 512 byte slab allocation today would
> > look the same as allocating 512 pages. This may result in premature
> > flushes, which means unnecessary work and latency.
> >
> > Normalize all the state values passed into memcg_rstat_updated() to
> > pages.
>
> I've dreamed about such normalization since error estimates were
> introduced :-)
>
> (As touched in the previous patch) it makes me wonder whether it makes
> sense to add up state and event counters (apples and oranges).

I conceptually agree that we are adding apples and oranges, but in
practice if you look at memcg_vm_event_stat, most stat updates
correspond to 1 page worth of something. I am guessing the implicit
assumption here is that event updates correspond roughly to page-sized
state updates. It's not perfect, but perhaps it's acceptable.

>
> Shouldn't with this approach events: a) have a separate counter, b)
> wight with zero and rely on time-based flushing only?

(a) If we have separate counters we need to eventually sum them to
figure out if the total magnitude of pending updates is worth flushing
anyway, right?
(b) I would be more nervous to rely on time-based flushing only as I
don't know how this can affect workloads.

>
> Thanks,
> Michal