Date:   Wed, 21 Apr 2021 17:50:06 +0800
From:   Muchun Song <songmuchun@...edance.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Roman Gushchin <guro@...com>, Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        Xiongchun duan <duanxiongchun@...edance.com>,
        fam.zheng@...edance.com
Subject: Re: [External] Re: [PATCH] mm: memcontrol: fix root_mem_cgroup charging

On Wed, Apr 21, 2021 at 3:34 PM Michal Hocko <mhocko@...e.com> wrote:
>
> On Wed 21-04-21 14:26:44, Muchun Song wrote:
> > The below scenario can cause the page counters of the root_mem_cgroup
> > to be out of balance.
> >
> > CPU0:                                   CPU1:
> >
> > objcg = get_obj_cgroup_from_current()
> > obj_cgroup_charge_pages(objcg)
> >                                         memcg_reparent_objcgs()
> >                                             // reparent to root_mem_cgroup
> >                                             WRITE_ONCE(iter->memcg, parent)
> >     // memcg == root_mem_cgroup
> >     memcg = get_mem_cgroup_from_objcg(objcg)
> >     // do not charge to the root_mem_cgroup
> >     try_charge(memcg)
> >
> > obj_cgroup_uncharge_pages(objcg)
> >     memcg = get_mem_cgroup_from_objcg(objcg)
> >     // uncharge from the root_mem_cgroup
> >     page_counter_uncharge(&memcg->memory)
> >
> > This can cause the page counter to be less than the actual value.
> > We do not display the value (mem_cgroup_usage), so there
> > shouldn't be any actual problem, but there is a WARN_ON_ONCE in
> > page_counter_cancel(). Who knows if it will trigger? So it
> > is better to fix it.
>
> The changelog doesn't explain the fix and why you have chosen to charge
> kmem objects to root memcg and left all other try_charge users intact.

The object cgroup is special because its pages can be reparented. Only
the users of the objcg APIs need to be fixed.

> The reason is likely that those are not reparented now but that just
> adds an inconsistency.
>
> Is there any reason you haven't simply matched obj_cgroup_uncharge_pages
> to check for the root memcg and bail out early?

Because obj_cgroup_uncharge_pages() uncharges pages from the
root memcg unconditionally. Why? Because some pages can be
reparented to the root memcg, and to keep the root memcg's page
counter correct we have to uncharge those pages from the root
memcg. So we do not check whether the page belongs to the root
memcg when it is uncharged. Based on this, we have to make sure
that the root memcg's page counter is also increased when the
page is charged. I think the diagram in the commit log
illustrates this problem well.
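
To make the argument above concrete, here is a small userspace sketch
of the two cases. It is only a toy model under stated assumptions: the
toy_* types and helpers are hypothetical stand-ins, not the kernel's
structures or APIs; reparenting is modelled as a plain pointer
assignment; and the counter is a signed long so that an underflow is
visible as a negative number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical), not the kernel definitions. */
struct toy_memcg {
	struct toy_memcg *parent;	/* NULL for the root */
	long usage;			/* pages; signed so underflow shows up */
};

struct toy_objcg {
	struct toy_memcg *memcg;	/* repointed to the parent on reparenting */
};

/* Hierarchical accounting: adjust memcg and every ancestor. */
static void toy_counter_add(struct toy_memcg *memcg, long nr_pages)
{
	for (; memcg; memcg = memcg->parent)
		memcg->usage += nr_pages;
}

/* Charge via the objcg; skip_root models the early return for the root. */
static void toy_objcg_charge(struct toy_objcg *objcg, long nr_pages,
			     bool skip_root)
{
	struct toy_memcg *memcg = objcg->memcg;

	assert(memcg != NULL);
	if (skip_root && !memcg->parent)
		return;
	toy_counter_add(memcg, nr_pages);
}

/* Uncharge via the objcg; skip_root models "bail out early for root". */
static void toy_objcg_uncharge(struct toy_objcg *objcg, long nr_pages,
			       bool skip_root)
{
	struct toy_memcg *memcg = objcg->memcg;

	assert(memcg != NULL);
	if (skip_root && !memcg->parent)
		return;
	toy_counter_add(memcg, -nr_pages);
}

/*
 * Case 1: the reparent happens before the charge path looks up the
 * memcg, and the uncharge is unconditional. Returns the root usage.
 */
static long toy_race_root_usage(bool charge_skips_root)
{
	struct toy_memcg root = { .parent = NULL, .usage = 0 };
	struct toy_memcg child = { .parent = &root, .usage = 0 };
	struct toy_objcg objcg = { .memcg = &child };

	objcg.memcg = child.parent;			/* "CPU1": reparent */
	toy_objcg_charge(&objcg, 4, charge_skips_root);	/* "CPU0": charge */
	toy_objcg_uncharge(&objcg, 4, false);		/* later uncharge */
	return root.usage;
}

/*
 * Case 2: the pages are charged while the objcg still points at the
 * child, then reparented, then uncharged. Returns the root usage.
 */
static long toy_reparented_root_usage(bool uncharge_skips_root)
{
	struct toy_memcg root = { .parent = NULL, .usage = 0 };
	struct toy_memcg child = { .parent = &root, .usage = 0 };
	struct toy_objcg objcg = { .memcg = &child };

	toy_objcg_charge(&objcg, 4, true);	/* child and root both +4 */
	objcg.memcg = child.parent;		/* reparent to root */
	toy_objcg_uncharge(&objcg, 4, uncharge_skips_root);
	return root.usage;
}
```

In this model, toy_race_root_usage(true) returns -4 (the imbalance
from the diagram) and toy_race_root_usage(false) returns 0.
toy_reparented_root_usage(true) returns 4, which is why bailing out
early on uncharge would leave the root counter stale, while
toy_reparented_root_usage(false) returns 0.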

Thanks.

>
> > Signed-off-by: Muchun Song <songmuchun@...edance.com>
> > ---
> >  mm/memcontrol.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 1e68a9992b01..81b54bd9b9e0 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2686,8 +2686,8 @@ void mem_cgroup_handle_over_high(void)
> >       css_put(&memcg->css);
> >  }
> >
> > -static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > -                   unsigned int nr_pages)
> > +static int __try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > +                     unsigned int nr_pages)
> >  {
> >       unsigned int batch = max(MEMCG_CHARGE_BATCH, nr_pages);
> >       int nr_retries = MAX_RECLAIM_RETRIES;
> > @@ -2699,8 +2699,6 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> >       bool drained = false;
> >       unsigned long pflags;
> >
> > -     if (mem_cgroup_is_root(memcg))
> > -             return 0;
> >  retry:
> >       if (consume_stock(memcg, nr_pages))
> >               return 0;
> > @@ -2880,6 +2878,15 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> >       return 0;
> >  }
> >
> > +static inline int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> > +                          unsigned int nr_pages)
> > +{
> > +     if (mem_cgroup_is_root(memcg))
> > +             return 0;
> > +
> > +     return __try_charge(memcg, gfp_mask, nr_pages);
> > +}
> > +
> >  #if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MMU)
> >  static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  {
> > @@ -3125,7 +3132,7 @@ static int obj_cgroup_charge_pages(struct obj_cgroup *objcg, gfp_t gfp,
> >
> >       memcg = get_mem_cgroup_from_objcg(objcg);
> >
> > -     ret = try_charge(memcg, gfp, nr_pages);
> > +     ret = __try_charge(memcg, gfp, nr_pages);
> >       if (ret)
> >               goto out;
> >
> > --
> > 2.11.0
>
> --
> Michal Hocko
> SUSE Labs
