linux-kernel - Re: [PATCH v5 2/2] mm/memcontrol.c: Reduce reclaim retries in mem_cgroup_resize


Message-ID: <20180222154435.GO30681@dhcp22.suse.cz>
Date:   Thu, 22 Feb 2018 16:44:35 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Andrey Ryabinin <aryabinin@...tuozzo.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Cgroups <cgroups@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>
Subject: Re: [PATCH v5 2/2] mm/memcontrol.c: Reduce reclaim retries in
 mem_cgroup_resize_limit()

On Thu 22-02-18 18:38:11, Andrey Ryabinin wrote:
> 
> 
> On 02/22/2018 06:33 PM, Michal Hocko wrote:
> > On Thu 22-02-18 18:13:11, Andrey Ryabinin wrote:
> >>
> >>
> >> On 02/22/2018 05:09 PM, Michal Hocko wrote:
> >>> On Thu 22-02-18 16:50:33, Andrey Ryabinin wrote:
> >>>> On 02/21/2018 11:17 PM, Andrew Morton wrote:
> >>>>> On Fri, 19 Jan 2018 16:11:18 +0100 Michal Hocko <mhocko@...nel.org> wrote:
> >>>>>
> >>>>>> And to be honest, I do not really see why keeping retrying from
> >>>>>> mem_cgroup_resize_limit should be so much faster than keep retrying from
> >>>>>> the direct reclaim path. We are doing SWAP_CLUSTER_MAX batches anyway.
> >>>>>> mem_cgroup_resize_limit loop adds _some_ overhead but I am not really
> >>>>>> sure why it should be that large.
> >>>>>
> >>>>> Maybe restarting the scan lots of times results in rescanning lots of
> >>>>> ineligible pages at the start of the list before doing useful work?
> >>>>>
> >>>>> Andrey, are you able to determine where all that CPU time is being spent?
> >>>>>
> >>>>
> >>>> I should have been more specific about the test I did. The full script looks like this:
> >>>>
> >>>> mkdir -p /sys/fs/cgroup/memory/test
> >>>> echo $$ > /sys/fs/cgroup/memory/test/tasks
> >>>> cat 4G_file > /dev/null
> >>>> while true; do cat 4G_file > /dev/null; done &
> >>>> loop_pid=$!
> >>>> perf stat echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
> >>>> echo -1 > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
> >>>> kill $loop_pid
> >>>>
> >>>>
> >>>> I think the additional loops add some overhead. It's not that big by itself, but
> >>>> this small overhead allows the task to refill slightly more pages, increasing
> >>>> the total number of pages that mem_cgroup_resize_limit() needs to reclaim.
> >>>>
> >>>> By using the following commands to show the number of reclaimed pages:
> >>>> perf record -e vmscan:mm_vmscan_memcg_reclaim_end echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
> >>>> perf script|cut -d '=' -f 2| paste -sd+ |bc
> >>>>
> >>>> I've got 1259841 pages (4.9G) with the patch vs 1394312 pages (5.4G) without it.
> >>>
> >>> So how does the picture change if you have multiple producers?
> >>>
> >>
> >> Drastically, in favor of the patch. But the numbers are *very* fickle from run to run.
> >>
> >> Inside a 5G VM with 4 CPUs (qemu -m 5G -smp 4) and 4 processes in the cgroup reading 1G files:
> >> "while true; do cat /1g_f$i > /dev/null; done &"
> >>
> >> with the patch:
> >> best:  1.04 sec, 9.7G reclaimed
> >> worst: 2.2 sec, 16G reclaimed
> >>
> >> without the patch:
> >> best:  5.4 sec, 35G reclaimed
> >> worst: 22.2 sec, 136G reclaimed
> > 
> > Could you also compare how much memory we reclaim with/without the
> > patch?
> > 
> 
> I did and I wrote the results. Please look again.

I must have forgotten. Care to point me to the message-id?
20180119132544.19569-2-aryabinin@...tuozzo.com doesn't contain this
information and a quick glance over the follow-up thread doesn't turn up
anything either. Ideally, this should be in the patch changelog, btw.

-- 
Michal Hocko
SUSE Labs
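
Note on the measurement quoted above: the "perf script | cut | paste | bc"
pipeline adds up the value after "=" on each recorded
vmscan:mm_vmscan_memcg_reclaim_end event. The following is a minimal sketch of
the same summation, not something posted in the thread. It assumes each event
line carries a single "nr_reclaimed=<N>" field and that pages are 4096 bytes;
the function names sum_reclaimed and pages_to_gib are hypothetical helpers
introduced only for illustration.

```shell
#!/bin/sh
# Sketch only: helper names and the 4096-byte page size are assumptions.

# Read trace lines on stdin and print the sum of all nr_reclaimed=<N> values.
# Lines without an nr_reclaimed field are ignored.
sum_reclaimed() {
    awk -F'nr_reclaimed=' '
        NF > 1 { split($2, a, /[^0-9]/); total += a[1] }
        END    { printf "%.0f\n", total }'
}

# Convert a page count to GiB.
# $1 = number of pages, $2 = page size in bytes (defaults to 4096).
pages_to_gib() {
    awk -v pages="$1" -v size="${2:-4096}" \
        'BEGIN { printf "%.2f\n", pages * size / (1024 * 1024 * 1024) }'
}
```

After a "perf record" run like the one quoted above, this could be used as
"perf script | sum_reclaimed", with the result passed to pages_to_gib. Matching
on the field name rather than cutting on "=" keeps the sum correct if an event
line ever carries more than one key=value pair.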