lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALvZod4ea4fR2n1EdZ3HwB3O3iWDHw5nXRnPLKbR6mAuDkWuQA@mail.gmail.com>
Date:   Fri, 4 Jan 2019 16:47:44 -0800
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Yang Shi <yang.shi@...ux.alibaba.com>
Cc:     Michal Hocko <mhocko@...e.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [v2 PATCH 3/5] mm: memcontrol: introduce wipe_on_offline interface

On Fri, Jan 4, 2019 at 4:21 PM Yang Shi <yang.shi@...ux.alibaba.com> wrote:
>
> We have some usecases which create and remove memcgs very frequently,
> and the tasks in the memcg may just access the files which are unlikely
> accessed by anyone else.  So, we prefer force_empty the memcg before
> rmdir'ing it to reclaim the page cache so that they don't get
> accumulated to incur unnecessary memory pressure.  Since the memory
> pressure may incur direct reclaim to harm some latency sensitive
> applications.
>
> Force empty would help out such usecase, however force empty reclaims
> memory synchronously when writing to memory.force_empty.  It may take
> some time to return and the afterwards operations are blocked by it.
> Although this can be done in background, some usecases may need create
> new memcg with the same name right after the old one is deleted.  So,
> the creation might get blocked by the before reclaim/remove operation.
>
> Delaying memory reclaim in cgroup offline for such usecase sounds
> reasonable.  Introduced a new interface, called wipe_on_offline for both
> default and legacy hierarchy, which does memory reclaim in css offline
> kworker.
>
> Writing to 1 would enable it, writing 0 would disable it.
>
> Suggested-by: Michal Hocko <mhocko@...e.com>
> Cc: Johannes Weiner <hannes@...xchg.org>
> Signed-off-by: Yang Shi <yang.shi@...ux.alibaba.com>
> ---
>  include/linux/memcontrol.h |  3 +++
>  mm/memcontrol.c            | 49 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 52 insertions(+)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 83ae11c..2f1258a 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -311,6 +311,9 @@ struct mem_cgroup {
>         struct list_head event_list;
>         spinlock_t event_list_lock;
>
> +       /* Reclaim as much as possible memory in offline kworker */
> +       bool wipe_on_offline;
> +
>         struct mem_cgroup_per_node *nodeinfo[0];
>         /* WARNING: nodeinfo must be the last member here */
>  };
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 75208a2..5a13c6b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2918,6 +2918,35 @@ static ssize_t mem_cgroup_force_empty_write(struct kernfs_open_file *of,
>         return mem_cgroup_force_empty(memcg) ?: nbytes;
>  }
>
> +static int wipe_on_offline_show(struct seq_file *m, void *v)
> +{
> +       struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
> +
> +       seq_printf(m, "%lu\n", (unsigned long)memcg->wipe_on_offline);
> +
> +       return 0;
> +}
> +
> +static int wipe_on_offline_write(struct cgroup_subsys_state *css,
> +                                struct cftype *cft, u64 val)
> +{
> +       int ret = 0;
> +
> +       struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +
> +       if (mem_cgroup_is_root(memcg))
> +               return -EINVAL;
> +
> +       if (val == 0)
> +               memcg->wipe_on_offline = false;
> +       else if (val == 1)
> +               memcg->wipe_on_offline = true;
> +       else
> +               ret = -EINVAL;
> +
> +       return ret;
> +}
> +
>  static u64 mem_cgroup_hierarchy_read(struct cgroup_subsys_state *css,
>                                      struct cftype *cft)
>  {
> @@ -4283,6 +4312,11 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
>                 .write = mem_cgroup_reset,
>                 .read_u64 = mem_cgroup_read_u64,
>         },
> +       {
> +               .name = "wipe_on_offline",

What about "force_empty_on_offline"?

> +               .seq_show = wipe_on_offline_show,
> +               .write_u64 = wipe_on_offline_write,
> +       },
>         { },    /* terminate */
>  };
>
> @@ -4569,6 +4603,15 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
>         page_counter_set_min(&memcg->memory, 0);
>         page_counter_set_low(&memcg->memory, 0);
>
> +       /*
> +        * Reclaim as much as possible memory when offlining.
> +        *
> +        * Do it after min/low is reset otherwise some memory might
> +        * be protected by min/low.
> +        */
> +       if (memcg->wipe_on_offline)
> +               mem_cgroup_force_empty(memcg);
> +

mem_cgroup_force_empty() also does drain_all_stock(), so, move
drain_all_stock() in mem_cgroup_css_offline() to the else of 'if
(memcg->wipe_on_offline)'.

>         memcg_offline_kmem(memcg);
>         wb_memcg_offline(memcg);
>
> @@ -5694,6 +5737,12 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
>                 .seq_show = memory_oom_group_show,
>                 .write = memory_oom_group_write,
>         },
> +       {
> +               .name = "wipe_on_offline",
> +               .flags = CFTYPE_NOT_ON_ROOT,
> +               .seq_show = wipe_on_offline_show,
> +               .write_u64 = wipe_on_offline_write,
> +       },
>         { }     /* terminate */
>  };
>
> --
> 1.8.3.1
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ