linux-kernel - Re: [PATCH] cgroup: rstat: optimize flush through speculative test

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YVtGHoboSix3rexr@slm.duckdns.org>
Date:   Mon, 4 Oct 2021 08:21:18 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Michal Koutný <mkoutny@...e.com>,
        Cgroups <cgroups@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] cgroup: rstat: optimize flush through speculative test

On Mon, Oct 04, 2021 at 11:07:45AM -0700, Shakeel Butt wrote:
> > Sorry for being so slow but can you point to the exact call path which gets
> > slowed down so significantly?
> 
> This is the mem_cgroup_flush_stats() inside workingset_refault() in
> mm/workingset.c.

I see. Was looking at a repo which was too old.

> > I'm mostly wondering whether we need some sort
> > of time-batched flushes because even with lock avoidance the flush path
> > really isn't great when called frequently. We can mitigate it further if
> > necessary - e.g. by adding an "updated" bitmap so that the flusher doesn't
> > have to go around touching the cachelines for all the cpus.
> 
> For the memcg stats, I already proposed a batched flush at [1].
> 
> I actually did perform the same experiment with the proposed patch
> along with [1] and it improves around just 1%. More specifically for
> memcg stats [1] is good enough but that is memcg specific and this
> patch has merits on its own.

So, the current rstat code doesn't pay a lot of attention to optimizing the
read path - the reasoning being that as long as we avoid O(nr_cgroups), the
flush operations aren't frequent enough to be problematic. The use in
refault path seems to change that balance and it likely is worthwhile to
update rstat accordingly. As I mentioned above, a next step could be adding
a cpumask which tracks cpus with populated updated list, which should add
pretty small cost to the writers while making frequent flushes significantly
cheaper.

What do you think about that approach? While the proposed patch looks fine,
it kinda bothers me that it's a very partial optimization - ie. if flush
frequency is high enough for this to matter, that for_each_possible_cpu()
scanning loop really isn't appropriate.

Thanks.

-- 
tejun