Message-ID: <ZN0eqq4hLRYQPHCI@slm.duckdns.org>
Date: Wed, 16 Aug 2023 09:08:26 -1000
From: Tejun Heo <tj@...nel.org>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Yosry Ahmed <yosryahmed@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Johannes Weiner <hannes@...xchg.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Muchun Song <muchun.song@...ux.dev>, cgroups@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Ivan Babrou <ivan@...udflare.com>
Subject: Re: [PATCH] mm: memcg: provide accurate stats for userspace reads
Hello,
On Wed, Aug 16, 2023 at 10:11:20AM -0700, Shakeel Butt wrote:
> These options are not white and black and there can be something in
> between but let me be very clear on what I don't want and would NACK.
I'm not a big fan of interfaces with hidden states. What you're proposing
isn't strictly that but it's still a bit nasty. So, if we can get by without
doing that, that'd be great.
> I don't want a global sleepable lock which can be taken by potentially
> any application running on the system. We have seen similar global
> locks causing isolation and priority inversion issues in production.
> So, not another lock which needs to be taken under extreme condition
> (reading stats under OOM) by a high priority task (node controller)
> and might be held by a low priority task.
Yeah, this is a real concern. Those priority inversions do occur and can be
serious but causing serious problems under memory pressure usually requires
involving memory allocations and IOs. Here, it's just all CPU. So, at least
in OOM conditions, this shouldn't be in the way (the system wouldn't have
anything else to do anyway).
It is true that this can still lead to priority inversions through CPU competition tho.
However, that problem isn't necessarily solved by what you're suggesting
either unless you want to restrict explicit flushing based on permissions
which is another can of worms.
My preference is not exposing this in the user interface. This is mostly arising
from internal implementation details and isn't what users necessarily care
about. There are many things we can do on the kernel side to make trade-offs
among overhead, staleness and priority inversions. If we make this an
explicit userland interface behavior, we get locked into those semantics,
which we'll likely regret in the future.
Thanks.
--
tejun