linux-kernel - Re: [PATCH] mm: memcg: provide accurate stats for userspace reads

Message-ID: <CAJD7tkYZxjAHrodVDK=wmz-sULJrq2VhC_5ecRP7T-KiaOcTuw@mail.gmail.com>
Date:   Fri, 11 Aug 2023 19:11:43 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Michal Hocko <mhocko@...e.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Muchun Song <muchun.song@...ux.dev>, cgroups@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: memcg: provide accurate stats for userspace reads

On Fri, Aug 11, 2023 at 7:08 PM Shakeel Butt <shakeelb@...gle.com> wrote:
>
> Hi all,
>
> (sorry for late response as I was away)
>
> On Fri, Aug 11, 2023 at 1:40 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> >
> [...]
> > > > >
> > > > > Last note, for /proc/vmstat we have /proc/sys/vm/stat_refresh to trigger
> > > > > an explicit refresh. For those users who really need more accurate
> > > > > numbers we might consider interface like that. Or allow to write to stat
> > > > > file and do that in the write handler.
> > > >
> > > > This wouldn't be my first option, but if that's the only way to get
> > > > accurate stats I'll take it.
> > >
> > > To be honest, this would be my preferred option, for two reasons:
> > > a) we do not want to guarantee too much on the precision front, because
> > > that would just make maintenance much harder, with different
> > > people having different opinions on how much precision is enough, and b)
> > > it makes the rarer (need precise) case the special case rather than
> > > the default.
> >
> > How about we go with the proposed approach in this patch (or the mutex
> > approach as it's much cleaner), and if someone complains about slow
> > reads we revert the change and introduce the refresh API? We might
> > just get away with making all reads accurate and avoid the hassle of
> > updating some userspace readers to do write-then-read. We don't know
> > for sure that something will regress.
> >
> > What do you think?
>
> Actually I am with Michal on this one. As I see multiple regression
> reports for reading the stats, I am inclined towards rate limiting the
> sync stats flushing from user readable interfaces (through
> mem_cgroup_flush_stats_ratelimited()) and providing a separate
> interface as suggested by Michal to explicitly flush the stats for
> users ok with the cost. Since we flush the stats every 2 seconds, most
> of the users should be fine and the users who care about accuracy can
> pay for it.

I am worried that writing to a stat file to trigger a flush and then
reading it will increase the staleness window, which is what we are
trying to reduce here.
Would it be acceptable to add a separate interface to explicitly read
flushed stats without having to write first? If the distinction
disappears in the future we can just short-circuit both interfaces.
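
To make the two options in this thread concrete, here is a minimal
userspace sketch in C. It is not kernel code and not the patch under
discussion: struct stats, stats_read() and stats_read_flushed() are
hypothetical names, time is passed in as plain milliseconds, and the
2000 ms period is only assumed to mirror the 2-second periodic flush
mentioned above. stats_read() models a default read that flushes at most
once per period, and stats_read_flushed() models a separate interface
that always flushes before returning, with no write step in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors the 2-second periodic flush mentioned in the thread. */
#define FLUSH_PERIOD_MS 2000

/* Hypothetical stand-in for a counter with deferred (batched) updates. */
struct stats {
	long pending;           /* updates not yet folded into 'flushed' */
	long flushed;           /* value that readers see */
	uint64_t next_flush_ms; /* earliest time a rate-limited flush may run */
	unsigned int flush_count;
};

/* Record an update cheaply; it stays invisible to readers until a flush. */
static void stats_update(struct stats *s, long delta)
{
	assert(s != NULL);
	s->pending += delta;
}

/* Unconditional flush: fold pending updates in and push the deadline out. */
static void stats_flush(struct stats *s, uint64_t now_ms)
{
	s->flushed += s->pending;
	s->pending = 0;
	s->next_flush_ms = now_ms + FLUSH_PERIOD_MS;
	s->flush_count++;
}

/* Rate-limited flush: do nothing if a flush already ran within the period. */
static bool stats_flush_ratelimited(struct stats *s, uint64_t now_ms)
{
	if (now_ms < s->next_flush_ms)
		return false;
	stats_flush(s, now_ms);
	return true;
}

/* Default read path: cheap, but may return a value up to one period stale. */
static long stats_read(struct stats *s, uint64_t now_ms)
{
	stats_flush_ratelimited(s, now_ms);
	return s->flushed;
}

/* Separate "accurate" read path: always pays for a flush before reading. */
static long stats_read_flushed(struct stats *s, uint64_t now_ms)
{
	stats_flush(s, now_ms);
	return s->flushed;
}
```

In this model, a reader calling stats_read() shortly after a flush sees
the old value, while stats_read_flushed() sees every update made so far.
Because the flush and the read happen in one call, there is no window
between a write-triggered flush and the subsequent read in which new
updates can pile up unobserved.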