Message-ID: <CAJD7tkY1cNcHpNdjXcG8EGCGLJP6+_kkkJYn-yGZ_hJLB6hGmA@mail.gmail.com>
Date: Tue, 28 Mar 2023 12:25:19 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Tejun Heo <tj@...nel.org>, Josef Bacik <josef@...icpanda.com>,
Jens Axboe <axboe@...nel.dk>,
Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Koutný <mkoutny@...e.com>,
Vasily Averin <vasily.averin@...ux.dev>,
cgroups@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
bpf@...r.kernel.org
Subject: Re: [PATCH v1 7/9] workingset: memcg: sleep when flushing stats in workingset_refault()
On Tue, Mar 28, 2023 at 8:18 AM Shakeel Butt <shakeelb@...gle.com> wrote:
>
> On Mon, Mar 27, 2023 at 11:16 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> >
> > In workingset_refault(), we call mem_cgroup_flush_stats_ratelimited()
> > to flush stats within an RCU read section and with sleeping disallowed.
> > Move the call to mem_cgroup_flush_stats_ratelimited() above the RCU read
> > section and allow sleeping to avoid unnecessarily performing a lot of
> > work without sleeping.
> >
> > Since workingset_refault() is the only caller of
> > mem_cgroup_flush_stats_ratelimited(), just make it call the non-atomic
> > mem_cgroup_flush_stats().
> >
> > Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
>
> A nit below:
>
> Acked-by: Shakeel Butt <shakeelb@...gle.com>
>
> > ---
> > mm/memcontrol.c | 12 ++++++------
> > mm/workingset.c | 4 ++--
> > 2 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 57e8cbf701f3..0c0e74188e90 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -674,12 +674,6 @@ void mem_cgroup_flush_stats_atomic(void)
> > __mem_cgroup_flush_stats_atomic();
> > }
> >
> > -void mem_cgroup_flush_stats_ratelimited(void)
> > -{
> > - if (time_after64(jiffies_64, READ_ONCE(flush_next_time)))
> > - mem_cgroup_flush_stats_atomic();
> > -}
> > -
> > /* non-atomic functions, only safe from sleepable contexts */
> > static void __mem_cgroup_flush_stats(void)
> > {
> > @@ -695,6 +689,12 @@ void mem_cgroup_flush_stats(void)
> > __mem_cgroup_flush_stats();
> > }
> >
> > +void mem_cgroup_flush_stats_ratelimited(void)
> > +{
> > + if (time_after64(jiffies_64, READ_ONCE(flush_next_time)))
> > + mem_cgroup_flush_stats();
> > +}
> > +
> > static void flush_memcg_stats_dwork(struct work_struct *w)
> > {
> > __mem_cgroup_flush_stats();
> > diff --git a/mm/workingset.c b/mm/workingset.c
> > index af862c6738c3..7d7ecc46521c 100644
> > --- a/mm/workingset.c
> > +++ b/mm/workingset.c
> > @@ -406,6 +406,8 @@ void workingset_refault(struct folio *folio, void *shadow)
> > unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
> > eviction <<= bucket_order;
> >
> > + /* Flush stats (and potentially sleep) before holding RCU read lock */
>
> I think the only reason we use rcu lock is due to
> mem_cgroup_from_id(). Maybe we should add mem_cgroup_tryget_from_id().
> The other caller of mem_cgroup_from_id() in vmscan is already doing
> the same and could use mem_cgroup_tryget_from_id().
I think different callers of mem_cgroup_from_id() want different things:

(a) workingset_refault() reads the memcg from the id and doesn't
really care whether the memcg is online or not.

(b) __mem_cgroup_uncharge_swap() reads the memcg from the id and drops
refs acquired on the swapout path. It doesn't need tryget, since we
know for a fact that we are holding refs from the swapout path. It
doesn't care whether the memcg is online either.

(c) mem_cgroup_swapin_charge_folio() reads the memcg from the id and
then takes a ref with css_tryget_online() -- i.e. only if the refcount
is non-zero and the memcg is online.

So we would need at least mem_cgroup_tryget_from_id() and
mem_cgroup_tryget_online_from_id() to eliminate all direct calls of
mem_cgroup_from_id(). I am hesitant about (b) because if we used
mem_cgroup_tryget_from_id() there, the code would take a ref, then
drop the ref it has been carrying since swapout, then drop the ref it
just took.
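
For concreteness, the helper for (a) could look something like the
below (untested sketch; the name is just your suggestion, nothing in
the tree yet):

static inline struct mem_cgroup *mem_cgroup_tryget_from_id(unsigned short id)
{
	struct mem_cgroup *memcg;

	/* RCU only protects the id -> memcg lookup itself */
	rcu_read_lock();
	memcg = mem_cgroup_from_id(id);
	/* tryget so the caller can use memcg after rcu_read_unlock() */
	if (memcg && !css_tryget(&memcg->css))
		memcg = NULL;
	rcu_read_unlock();

	return memcg;
}

That would keep the RCU section contained to the lookup, as you
suggested, with the caller doing mem_cgroup_put() when done.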
WDYT?
>
> Though this can be done separately to this series (if we decide to do
> it at all).