netdev - Re: [External] Re: [PATCH] mm/memcontrol: Add the drop


Message-ID: <CALvZod64Qwzjv3N2PO-EUtMkA4bs_PM=Tq4=cmuM0VO9P3BAjw@mail.gmail.com>
Date:   Tue, 22 Sep 2020 12:57:19 -0700
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Chunxin Zang <zangchunxin@...edance.com>
Cc:     Chris Down <chris@...isdown.name>, Michal Hocko <mhocko@...e.com>,
        Yafang Shao <laoar.shao@...il.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
        Jonathan Corbet <corbet@....net>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>, kafai@...com,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        andriin@...com, john.fastabend@...il.com, kpsingh@...omium.org,
        Cgroups <cgroups@...r.kernel.org>, linux-doc@...r.kernel.org,
        Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>, bpf@...r.kernel.org
Subject: Re: [External] Re: [PATCH] mm/memcontrol: Add the drop_cache
 interface for cgroup v2

On Tue, Sep 22, 2020 at 5:37 AM Chunxin Zang <zangchunxin@...edance.com> wrote:
>
> On Tue, Sep 22, 2020 at 6:42 PM Chris Down <chris@...isdown.name> wrote:
> >
> > Chunxin Zang writes:
> > >On Tue, Sep 22, 2020 at 5:51 PM Chris Down <chris@...isdown.name> wrote:
> > >>
> > >> Chunxin Zang writes:
> > >> >My use case is that there are two types of services on one server, with
> > >> >different priorities. Type_A has the highest priority; we need to ensure
> > >> >its scheduling latency, I/O latency, and memory are sufficient. Type_B has
> > >> >the lowest priority; we expect it not to affect Type_A when it runs.
> > >> >So Type_A could use memory without any limit. Type_B could use memory
> > >> >only when memory is absolutely sufficient. But we cannot estimate how
> > >> >much memory Type_B should use, because everything is dynamic. So we
> > >> >can't set Type_B's memory.high.
> > >> >
> > >> >So we want to release Type_B's memory when global memory is insufficient,
> > >> >in order to ensure the quality of service of Type_A. In the past, we used
> > >> >the 'force_empty' interface of cgroup v1.
> > >>
> > >> This sounds like a perfect use case for memory.low on Type_A, and it's pretty
> > >> much exactly what we invented it for. What's the problem with that?
> > >
> > >But we cannot estimate the minimum amount of memory Type_A uses.
> >
> > memory.low allows ballparking, you don't have to know exactly how much it uses.
> > Any amount of protection biases reclaim away from that cgroup.
> >
> > >For example:
> > >total memory: 100G
> > >At the beginning, Type_A was in an idle state, and it only used 10G of memory.
> > >The load is very low. We want to run Type_B to avoid wasting machine resources.
> > >When Type_B runs for a while, it used 80G of memory.
> > >At this time Type_A is busy, it needs more memory.
> >
> > Ok, so set memory.low for Type_A close to your maximum expected value.
>
> Please forgive me for not being able to understand why setting
> memory.low for Type_A can solve the problem.
> In my scenario, Type_A is the most important, so I will set memory.low to 100G.
> But 'memory.low' only takes effect passively, when the kernel is
> reclaiming memory. That means Type_B's memory is reclaimed only when
> Type_A is in the memory allocation slow path. This will affect Type_A's
> performance.
> We want to reclaim Type_B's memory in advance when A is expected to be busy.
>

How will you know when to reclaim from B? Are you polling /proc/meminfo?

From what I understand, you want to proactively reclaim from B, so
that A does not go into global reclaim and in the worst case kill B,
right?

BTW you can use memory.high to reclaim from B by setting it lower than
memory.current of B and reset it to 'max' once the reclaim is done.
Since 'B' is not high priority (I am assuming not a latency sensitive
workload), B hitting temporary memory.high should not be an issue.
Also I am assuming you don't much care about the amount of memory to
be reclaimed from B, so I think memory.high can fulfil your use-case.
However, if in the future you decide to proactively reclaim from all the
jobs based on their priority, i.e. more aggressive reclaim from B and a
little reclaim from A, then memory.high is not a good interface.
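
A minimal sketch of the memory.high approach described above, in Python.
It only uses the cgroup v2 files memory.current and memory.high; the
function names, the reclaim_bytes and settle_seconds parameters, and the
cgroup path in the usage note are assumptions for illustration, not part
of any existing tool. It also assumes the caller has write access to B's
cgroup directory.

```python
import time
from pathlib import Path


def read_current(cgroup: Path) -> int:
    """Return the cgroup's memory.current value in bytes."""
    return int((cgroup / "memory.current").read_text().strip())


def reclaim_via_memory_high(cgroup: Path, reclaim_bytes: int,
                            settle_seconds: float = 1.0) -> int:
    """Temporarily set memory.high below memory.current, then reset it.

    Returns the temporary memory.high value that was written.
    settle_seconds is a hypothetical grace period; how long reclaim
    actually needs depends on the kernel and the workload.
    """
    current = read_current(cgroup)
    target = max(current - reclaim_bytes, 0)
    high = cgroup / "memory.high"
    try:
        # Setting memory.high below current usage puts B under reclaim pressure.
        high.write_text(str(target))
        time.sleep(settle_seconds)
    finally:
        # Always lift the limit again, even if something above failed.
        high.write_text("max")
    return target
```

For example, reclaim_via_memory_high(Path("/sys/fs/cgroup/type_b"), 10 << 30)
would try to push roughly 10G out of a hypothetical "type_b" cgroup. The
amount actually reclaimed is not guaranteed, which matches the assumption
above that the exact amount does not matter much.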

Shakeel