Message-ID: <CAJD7tkY1cNcHpNdjXcG8EGCGLJP6+_kkkJYn-yGZ_hJLB6hGmA@mail.gmail.com>
Date:   Tue, 28 Mar 2023 12:25:19 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Tejun Heo <tj@...nel.org>, Josef Bacik <josef@...icpanda.com>,
        Jens Axboe <axboe@...nel.dk>,
        Zefan Li <lizefan.x@...edance.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Muchun Song <muchun.song@...ux.dev>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Koutný <mkoutny@...e.com>,
        Vasily Averin <vasily.averin@...ux.dev>,
        cgroups@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        bpf@...r.kernel.org
Subject: Re: [PATCH v1 7/9] workingset: memcg: sleep when flushing stats in workingset_refault()

On Tue, Mar 28, 2023 at 8:18 AM Shakeel Butt <shakeelb@...gle.com> wrote:
>
> On Mon, Mar 27, 2023 at 11:16 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> >
> > In workingset_refault(), we call mem_cgroup_flush_stats_ratelimited()
> > to flush stats within an RCU read section and with sleeping disallowed.
> > Move the call to mem_cgroup_flush_stats_ratelimited() above the RCU read
> > section and allow it to sleep, so that we avoid doing a large amount of
> > flushing work in an atomic (non-sleeping) context.
> >
> > Since workingset_refault() is the only caller of
> > mem_cgroup_flush_stats_ratelimited(), just make it call the non-atomic
> > mem_cgroup_flush_stats().
> >
> > Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
>
> A nit below:
>
> Acked-by: Shakeel Butt <shakeelb@...gle.com>
>
> > ---
> >  mm/memcontrol.c | 12 ++++++------
> >  mm/workingset.c |  4 ++--
> >  2 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 57e8cbf701f3..0c0e74188e90 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -674,12 +674,6 @@ void mem_cgroup_flush_stats_atomic(void)
> >                 __mem_cgroup_flush_stats_atomic();
> >  }
> >
> > -void mem_cgroup_flush_stats_ratelimited(void)
> > -{
> > -       if (time_after64(jiffies_64, READ_ONCE(flush_next_time)))
> > -               mem_cgroup_flush_stats_atomic();
> > -}
> > -
> >  /* non-atomic functions, only safe from sleepable contexts */
> >  static void __mem_cgroup_flush_stats(void)
> >  {
> > @@ -695,6 +689,12 @@ void mem_cgroup_flush_stats(void)
> >                 __mem_cgroup_flush_stats();
> >  }
> >
> > +void mem_cgroup_flush_stats_ratelimited(void)
> > +{
> > +       if (time_after64(jiffies_64, READ_ONCE(flush_next_time)))
> > +               mem_cgroup_flush_stats();
> > +}
> > +
> >  static void flush_memcg_stats_dwork(struct work_struct *w)
> >  {
> >         __mem_cgroup_flush_stats();
> > diff --git a/mm/workingset.c b/mm/workingset.c
> > index af862c6738c3..7d7ecc46521c 100644
> > --- a/mm/workingset.c
> > +++ b/mm/workingset.c
> > @@ -406,6 +406,8 @@ void workingset_refault(struct folio *folio, void *shadow)
> >         unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
> >         eviction <<= bucket_order;
> >
> > +       /* Flush stats (and potentially sleep) before holding RCU read lock */
>
> I think the only reason we take the RCU read lock here is
> mem_cgroup_from_id(). Maybe we should add mem_cgroup_tryget_from_id().
> The other caller of mem_cgroup_from_id() in vmscan already open-codes
> the same pattern and could use mem_cgroup_tryget_from_id() as well.

I think different callers of mem_cgroup_from_id() want different things.

(a) workingset_refault() reads the memcg from the id and doesn't
really care if the memcg is online or not.

(b) __mem_cgroup_uncharge_swap() reads the memcg from the id and drops
refs acquired on the swapout path. It doesn't need tryget as we should
know for a fact that we are holding refs from the swapout path. It
doesn't care if the memcg is online or not.

(c) mem_cgroup_swapin_charge_folio() reads the memcg from the id and
then gets a ref with css_tryget_online() -- so only if the refcount is
non-zero and the memcg is online.

So we would at least need mem_cgroup_tryget_from_id() and
mem_cgroup_tryget_online_from_id() to eliminate all direct calls of
mem_cgroup_from_id(). I am hesitant about (b) because if we use
mem_cgroup_tryget_from_id() the code will be getting a ref, then
dropping the ref we have been carrying from swapout, then dropping the
ref we just acquired.
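
For concreteness, a rough (untested) sketch of what I imagine the two
helpers would look like, assuming they live next to mem_cgroup_from_id()
in memcontrol.c -- the names and keeping the RCU read section inside the
helpers are just my assumptions:

/* Untested sketch, only to illustrate the idea discussed above. */

/* Look up a memcg by id and pin it, whether or not it is online. */
struct mem_cgroup *mem_cgroup_tryget_from_id(unsigned short id)
{
	struct mem_cgroup *memcg;

	rcu_read_lock();
	memcg = mem_cgroup_from_id(id);
	if (memcg && !css_tryget(&memcg->css))
		memcg = NULL;
	rcu_read_unlock();

	return memcg;
}

/* Same, but only succeed if the memcg is still online. */
struct mem_cgroup *mem_cgroup_tryget_online_from_id(unsigned short id)
{
	struct mem_cgroup *memcg;

	rcu_read_lock();
	memcg = mem_cgroup_from_id(id);
	if (memcg && !css_tryget_online(&memcg->css))
		memcg = NULL;
	rcu_read_unlock();

	return memcg;
}

With something like the above, (a) and the vmscan caller could use
mem_cgroup_tryget_from_id(), (c) could use
mem_cgroup_tryget_online_from_id(), and the callers would no longer need
to hold the RCU read lock around the lookup themselves. (b) is the odd
one out for the reason above.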

WDYT?


>
> Though this can be done separately to this series (if we decide to do
> it at all).
