Message-ID: <20201117130434.GA10769@pc636>
Date: Tue, 17 Nov 2020 14:04:34 +0100
From: Uladzislau Rezki <urezki@...il.com>
To: huang ying <huang.ying.caritas@...il.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
LKML <linux-kernel@...r.kernel.org>,
Hillf Danton <hdanton@...a.com>,
Michal Hocko <mhocko@...e.com>,
Matthew Wilcox <willy@...radead.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@...ymobile.com>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH 2/2] mm/vmalloc: rework the drain logic
On Tue, Nov 17, 2020 at 10:37:34AM +0800, huang ying wrote:
> On Tue, Nov 17, 2020 at 6:00 AM Uladzislau Rezki (Sony)
> <urezki@...il.com> wrote:
> >
> > The current "lazy drain" model suffers from at least two issues.
> >
> > The first is that the list of vmap areas is unsorted, so
> > identifying the [min:max] range of areas to be drained
> > requires a full list scan, which is time-consuming if the
> > list is long.
> >
> > The second, as the next step, is merging all fragments
> > with the free space. This is also time-consuming because it
> > has to iterate over the entire list that holds outstanding lazy
> > areas.
> >
> > See below the "preemptirqsoff" tracer output, which illustrates the
> > high latency. It is ~24 676us. Our workloads, such as audio and video,
> > are affected by such long latency:
>
> This seems like a real problem. But I found there is a long-latency
> avoidance mechanism in the loop in __purge_vmap_area_lazy(), as
> follows:
>
>     if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
>         cond_resched_lock(&free_vmap_area_lock);
>
I added that "resched threshold" because in my tests I could simply
run out of memory: the drain work could not keep up with such a long
list of outstanding vmap areas.
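
To make the effect of that check concrete, here is a minimal user-space
sketch. It is not the kernel code: drain_sketch() and its counters are
hypothetical, and a yield is only counted, not performed. It assumes the
drain returns one area per iteration and allows a reschedule only once
the outstanding count has dropped below the threshold.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the drain loop. 'outstanding' stands in for
 * vmap_lazy_nr; the return value is how many times the loop would have
 * been allowed to yield (the place where cond_resched_lock() sits).
 */
size_t drain_sketch(size_t outstanding, size_t resched_threshold)
{
    size_t yields = 0;

    while (outstanding > 0) {
        outstanding--;              /* one lazy area handed back */

        /* Yield only when the backlog is small enough. */
        if (outstanding < resched_threshold)
            yields++;
    }

    return yields;
}
```

With a backlog of 10 and a threshold of 4, the sketch yields only during
the last 4 iterations, so the first part of the drain runs without a
break. That is the trade-off: the backlog shrinks quickly, but the
no-preemption window grows with the backlog.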
>
> If it works properly, the latency problem can be solved. Can you
> check whether this doesn't work for you?
>
We have that cond_resched_lock() in our products. The patch in question
creates bigger vmap areas at an early step (by merging them), so the
final structure becomes less fragmented, which speeds up the drain
logic and thus reduces the preemption-off time.
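
As a rough illustration of why early merging helps, the sketch below
coalesces touching ranges in a sorted array. It is only a user-space
model under my own assumptions: struct range and coalesce_sorted() are
hypothetical names, and the patch itself does not work on a flat array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vmap area: the half-open range [start, end). */
struct range {
    unsigned long start;
    unsigned long end;
};

/*
 * Merge touching or overlapping ranges in place. The input must be
 * sorted by start address. Returns the number of ranges left.
 */
size_t coalesce_sorted(struct range *r, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;

    for (size_t i = 1; i < n; i++) {
        if (r[i].start <= r[out].end) {
            /* Neighbour touches the current range: extend it. */
            if (r[i].end > r[out].end)
                r[out].end = r[i].end;
        } else {
            /* Gap found: start a new output range. */
            r[++out] = r[i];
        }
    }

    return out + 1;
}
```

For example, [0,4), [4,8) and [16,20) collapse into [0,8) and [16,20),
so a later drain pass has two entries to process instead of three.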
Apart from that, high-priority tasks such as RT or DL ones that use
vmalloc()/vfree() can start the draining process from their own
contexts, which is also a problem. In that sense, I think we need to
make the vfree() call asynchronous, so that latency-sensitive tasks and
others do not perform any draining from their contexts.
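
The idea of moving the drain out of the caller's context could be
sketched in user space as follows. This is an assumption-laden model,
not a proposal for the actual interface: free_deferred(),
drain_pending() and the pending list are hypothetical, and the sketch
is single-threaded, so it has no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node for a buffer whose release has been postponed. */
struct pending {
    void *ptr;
    struct pending *next;
};

static struct pending *pending_head;
static size_t drained_count;

/* Caller side: only queue the pointer, do no draining here. */
int free_deferred(void *ptr)
{
    struct pending *p = malloc(sizeof(*p));

    if (!p)
        return -1;

    p->ptr = ptr;
    p->next = pending_head;
    pending_head = p;
    return 0;
}

/* Worker side: meant to run later, outside the latency-sensitive task. */
size_t drain_pending(void)
{
    size_t n = 0;

    while (pending_head) {
        struct pending *p = pending_head;

        pending_head = p->next;
        free(p->ptr);
        free(p);
        n++;
    }

    drained_count += n;
    return n;
}
```

The caller's cost is then one small list insertion, and the expensive
part runs only when the worker calls drain_pending().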
--
Vlad Rezki