Date:   Tue, 15 Sep 2020 12:23:59 +0800
From:   Chunxin Zang <zangchunxin@...edance.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Muchun Song <songmuchun@...edance.com>
Subject: Re: [External] Re: [PATCH v2] mm/vmscan: fix infinite loop in drop_slab_node

On Mon, Sep 14, 2020 at 11:17 PM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 14-09-20 23:02:15, Chunxin Zang wrote:
> > On Mon, Sep 14, 2020 at 9:47 PM Michal Hocko <mhocko@...e.com> wrote:
> >
> > > On Mon 14-09-20 21:25:59, Chunxin Zang wrote:
> > > > On Mon, Sep 14, 2020 at 5:30 PM Michal Hocko <mhocko@...e.com> wrote:
> > > >
> > > > > The subject is misleading because this patch doesn't fix an infinite
> > > > > loop, right? It just allows the userspace to interrupt the operation.
> > > > >
> > > > >
> > > > Yes, so we are preparing a separate patch following Vlastimil's
> > > > recommendation: use a doubling threshold to end the loop.
> > >
> > > That still means the changelog needs an update
> > >
> >
> > The patch is already merged in the linux-next branch. Can I update the
> > changelog now?
>
> Yes. Andrew will refresh it. He doesn't have a git tree which would
> prevent rewriting the patch.
>
> > This is my first patch, please forgive me :)
>
> No worries. The mm patch workflow is rather different from others.
>
> > > > On Thu, Sep 10, 2020 at 1:59 AM Vlastimil Babka <vbabka@...e.cz> wrote:
> > > > > > From: Chunxin Zang <zangchunxin@...edance.com>
> > > > > >
> > > > > ...
> > > > > - IMHO it's still worth bailing out in your scenario even without a
> > > > > signal, e.g. by doubling the threshold. But it can be a separate patch.
> > > > > Thanks!
> > > > > ...
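
One way to implement the doubling suggestion against the drop_slab_node()
loop quoted further down in this mail: keep an exit threshold that doubles
on every pass, so the loop terminates even if other tasks keep generating
reclaimable objects. A rough sketch of the idea only, not an actual patch:

    unsigned long freed, threshold = 10;

    do {
        /* ... one full shrink pass over all memcgs, summing into freed ... */

        /*
         * Demand twice as much progress on every pass: even if new
         * objects keep appearing, threshold eventually overtakes
         * freed, so the loop ends after a bounded number of passes.
         */
        threshold <<= 1;
    } while (freed > threshold);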
> > > >
> > > >
> > > >
> > > > On Wed 09-09-20 23:20:47, zangchunxin@...edance.com wrote:
> > > > > > From: Chunxin Zang <zangchunxin@...edance.com>
> > > > > >
> > > > > > On our server, there are about 10k memcgs on one machine. They use
> > > > > > memory very frequently. When I trigger drop caches, the process will
> > > > > > loop infinitely in drop_slab_node.
> > > > >
> > > > > Is this really an infinite loop, or does it just take a lot of time to
> > > > > process all the metadata in that setup? If this is really an infinite
> > > > > loop then we should look at it. My current understanding is that the
> > > > > operation would finish at some point; it just takes painfully long to
> > > > > get there.
> > > > >
> > > >
> > > > Yes, it's really an infinite loop. Every pass takes a lot of time, and
> > > > while it runs the memcgs keep allocating and freeing memory, so on the
> > > > next pass the total 'freed' is always bigger than 10.
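
For reference, the loop in question, roughly as it reads in mm/vmscan.c at
the time of this thread (a simplified sketch, not the exact source):

    void drop_slab_node(int nid)
    {
        unsigned long freed;

        do {
            struct mem_cgroup *memcg = NULL;

            freed = 0;
            memcg = mem_cgroup_iter(NULL, NULL, NULL);
            do {
                /* shrink the slab caches of this memcg on this node */
                freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
            } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
        } while (freed > 10);
    }

With 10k memcgs a single pass over the mem_cgroup_iter() walk already takes
a long time, and if more than 10 reclaimable objects appear meanwhile, the
outer loop simply starts over.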
> > >
> > > I am still not sure I follow. Do you mean that there is somebody
> > > constantly generating more objects to reclaim?
> > >
> >
> > Yes, this is my meaning. :)
> >
> >
> > > Maybe we are just not agreeing on the definition of an infinite loop, but
> > > in my book that means the exit condition can never be met. While somebody
> > > busily adding new objects might indeed cause drop caches to loop for a
> > > long time, this is to be expected from that interface: it is supposed to
> > > drop all the cache, and the cache can grow during the operation.
> > > --
> > >
> >
> > Because I have 10k memcgs, all of them heavy users of memory, more than
> > 10 reclaimable objects are always generated during each pass, so the exit
> > condition is never met.
>
> 10k or any number of memcgs shouldn't really make much of a difference.
> Except for the time the scan adds. Fundamentally we are talking about
> freed objects and whether they are on the global or per memcg lists
> should result in a similar behavior.
>
> > The drop caches process has no chance to exit the loop. Although the
> > purpose of the 'drop caches' interface is to release all caches, we still
> > need a way to terminate it, e.g. when, as in this case, the process takes
> > far too long to run.
>
> Yes, this is perfectly understandable. Having a bail out on a fatal signal
> is a completely reasonable thing to do. I am mostly confused by your
> infinite loop claims and by how this patch relates to them. I would
> propose this wording instead:
>
> "
> We have observed that drop_caches can take a considerable amount of
> time (<put data here>), especially when there are many memcgs involved,
> because they add additional overhead.
>
> It is quite unfortunate that the operation cannot be interrupted by a
> signal currently. Add a check for fatal signals into the main loop
> so that userspace can control early bailout.
> "

I understand what you mean. Yes, you are right: the patch only provides a
way for users to terminate the process; it does not reduce the time
consumption or improve the looping itself.
I will update the description of the patch in v3.
Thanks so much :)
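
For the archives, the check under discussion amounts to the following
(a sketch of the v2 patch from memory; the exact placement inside
drop_slab_node() may differ):

     	do {
     		struct mem_cgroup *memcg = NULL;
     
    +		/* let a SIGKILL'ed writer of drop_caches bail out */
    +		if (fatal_signal_pending(current))
    +			return;
    +
     		freed = 0;

With that in place, a stuck "echo 3 > /proc/sys/vm/drop_caches" writer,
like the one in the ps output quoted below, can be stopped with kill -9
instead of running forever.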

> or something along those lines.
>
> >
> >   root  357956 ... R    Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches
>
> --
> Michal Hocko
> SUSE Labs

Best wishes
Chunxin
