linux-kernel - Re: [PATCH v4 4/4] mm: memcg: use non-unified stats flushing for userspace reads

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAJD7tka4zEcu-jMycMo0=xB7PP1j7P0gu_weGJSLQvbhYMzv9Q@mail.gmail.com>
Date:   Tue, 12 Sep 2023 04:09:28 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Tejun Heo <tj@...nel.org>, Wei Xu <weixugc@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <muchun.song@...ux.dev>,
        Ivan Babrou <ivan@...udflare.com>,
        Michal Koutný <mkoutny@...e.com>,
        Waiman Long <longman@...hat.com>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
        Greg Thelen <gthelen@...gle.com>
Subject: Re: [PATCH v4 4/4] mm: memcg: use non-unified stats flushing for
 userspace reads

On Tue, Sep 12, 2023 at 4:03 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 11-09-23 10:21:24, Tejun Heo wrote:
> > Hello,
> >
> > On Mon, Sep 11, 2023 at 01:01:25PM -0700, Wei Xu wrote:
> > > Yes, it is the same test (10K contending readers). The kernel change
> > > is to remove stats_user_flush_mutex from mem_cgroup_user_flush_stats()
> > > so that the concurrent mem_cgroup_user_flush_stats() requests directly
> > > contend on cgroup_rstat_lock in cgroup_rstat_flush().
> >
> > I don't think it'd be a good idea to twist rstat and other kernel internal
> > code to accommodate 10k parallel readers.
>
> I didn't mean to suggest optimizing for this specific scenario. I was
> mostly curious whether the pathological case of unbound high latency due
> to lock dropping is easy to trigger by huge number of readers. It seems
> it is not and the mutex might not be really needed as a prevention.
>
> > If we want to support that, let's
> > explicitly support that by implementing better batching in the read path.
>
> Well, we need to be able to handle those situations because stat files
> are generally readable and we do not want unrelated workloads to
> influence each other heavily through this path.

I am working on a complete rework of this series based on the feedback
I got from Wei and the discussions here. I think I have something
simpler and more generic, and doesn't proliferate the number of
flushing variants we have. I am running some tests right now and will
share it as soon as I can.

It should address the high concurrency use case without adding a lot
of complexity. It basically involves a fast path where we only flush
the needed subtree if there's no contention, and a slow path where we
coalesce all flushing requests, and everyone just waits for a single
flush to complete (without spinning or contending on any locks). I am
trying to use this generic mechanism for both userspace reads and
in-kernel flushers. I am making sure in-kernel flushers do not
regress.

>
> [...]
>
> > When you have that many concurrent readers, most of them won't need to
> > actually flush.
>
> Agreed!
> --
> Michal Hocko
> SUSE Labs