Message-ID: <ZOzBgfzlGdrPD4gk@dhcp22.suse.cz>
Date:   Mon, 28 Aug 2023 17:47:13 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     Yosry Ahmed <yosryahmed@...gle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Shakeel Butt <shakeelb@...gle.com>,
        Muchun Song <muchun.song@...ux.dev>,
        Ivan Babrou <ivan@...udflare.com>, Tejun Heo <tj@...nel.org>,
        linux-mm@...ck.org, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] mm: memcg: use non-unified stats flushing for
 userspace reads

I have done my homework and studied the rstat code more (sorry, I should
have done that earlier).

On Fri 25-08-23 08:14:54, Yosry Ahmed wrote:
[...]
> I guess what I am trying to say is, breaking down that lock is a major
> surgery that might require re-designing or re-implementing some parts
> of rstat. I would be extremely happy to be proven wrong. If we can
> break down that lock then there is no need for unified flushing even
> for in-kernel contexts, and we can all live happily ever after with
> cheap(ish) and accurate stats flushing.

Yes, this seems like a big change that also overcomplicates the whole
thing. I am not sure this is worth it.

> I really hope we can move forward with the problems at hand (sometimes
> reads are expensive, sometimes reads are stale), and not block fixing
> them until we can come up with an alternative to that global lock
> (unless, of course, there is a simpler way of doing that).

Well, I really have to say that I do not like the notion that reading
stats is unpredictable. This just makes it really hard to use. If
precision has to be sacrificed then that should be preferable to
potentially high global lock contention. We already have that model in
place for /proc/vmstat (a configurable timeout for the flusher and a way
to flush explicitly). I appreciate you would like to have better
precision but, as you have explored, the locking is really hard to get
rid of here.

So from my POV I would prefer to avoid flushing from the stats reading
path and implement force flushing by writing to the stat file. If the 2s
flushing interval is considered too coarse I would be OK with allowing
it to be set from userspace. This way it would be more in line with
/proc/vmstat, which seems to be working quite well.
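
To make the model above concrete, here is a rough userspace sketch in
Python. It is not kernel code and not how rstat is implemented; the
class and method names (LazyStats, update, read, flush, run_periodic)
are hypothetical and exist only for illustration. It shows reads that
never flush, an explicit flush that stands in for a write to the stat
file, and a periodic flusher with a configurable interval.

```python
import threading


class LazyStats:
    """Toy model: per-CPU deltas are folded into the total only by the
    periodic flusher or by an explicit flush, never by a read."""

    def __init__(self, ncpus, flush_interval=2.0):
        # Seconds between periodic flushes; the 2s default mirrors the
        # interval mentioned above and is meant to be tunable.
        self.flush_interval = flush_interval
        self._percpu = [0] * ncpus
        self._total = 0
        self._lock = threading.Lock()

    def update(self, cpu, delta):
        # Record a change locally; the global lock is not taken here.
        self._percpu[cpu] += delta

    def read(self):
        # Predictable cost: return the last flushed value, which may be
        # stale by up to flush_interval seconds.
        return self._total

    def flush(self):
        # Explicit flush: fold all pending per-CPU deltas into the total.
        with self._lock:
            for cpu, pending in enumerate(self._percpu):
                self._total += pending
                self._percpu[cpu] = 0

    def run_periodic(self, stop_event):
        # Flush every flush_interval seconds until stop_event is set.
        while not stop_event.wait(self.flush_interval):
            self.flush()
```

A reader that needs fresh numbers calls flush() and then read(); every
other reader just calls read() and accepts bounded staleness, so only
explicit flushers and the periodic worker ever contend on the lock.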

If this is not acceptable or deemed a wrong approach long term then it
would be good to reconsider the current cgroup_rstat_lock at least,
either by turning it into a mutex or by dropping the yielding code, which
can severely affect the worst case latency AFAIU.

> Sorry for the very long reply :)

Thanks for bearing with me and taking time to formulate all this!
-- 
Michal Hocko
SUSE Labs
