Message-ID: <20190322174753.GA106077@google.com>
Date: Fri, 22 Mar 2019 13:47:53 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: Uladzislau Rezki <urezki@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...e.com>,
Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org,
LKML <linux-kernel@...r.kernel.org>,
Thomas Garnier <thgarnie@...gle.com>,
Oleksiy Avramchenko <oleksiy.avramchenko@...ymobile.com>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, Tejun Heo <tj@...nel.org>
Subject: Re: [RFC PATCH v2 0/1] improve vmap allocation

On Fri, Mar 22, 2019 at 05:52:59PM +0100, Uladzislau Rezki wrote:
> On Thu, Mar 21, 2019 at 03:01:06PM -0700, Andrew Morton wrote:
> > On Thu, 21 Mar 2019 20:03:26 +0100 "Uladzislau Rezki (Sony)" <urezki@...il.com> wrote:
> >
> > > Hello.
> > >
> > > This is v2 of the https://lkml.org/lkml/2018/10/19/786 rework. Instead of
> > > referring you to that link, I will go through it again, describing the improved
> > > allocation method, and list the changes between v1 and v2 at the end.
> > >
> > > ...
> > >
> >
> > > Performance analysis
> > > --------------------
> >
> > Impressive numbers. But this is presumably a worst-case microbenchmark.
> >
> > Are you able to describe the benefits which are observed in some
> > real-world workload which someone cares about?
> >
> We work with Android. Google uses its own tool called UiBench to measure
> performance of UI. It counts dropped or delayed frames, or as they call it,
> jank. Basically, if we deliver 59 (should be 60) frames per second then we
> get 1 jank/drop.

Agreed. Strictly speaking, "1 Jank" is not necessarily "1 frame drop". A
delayed frame is also a Jank. Just because a frame is delayed does not mean
it is dropped; there is double buffering etc. to absorb delays.

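To make the delayed-versus-dropped distinction concrete, here is a rough
sketch in C. It is not how UiBench computes jank; the 60 Hz budget, the
struct, and the function name are all assumptions made for illustration. It
counts every frame that overruns its budget as a jank, and separately counts
how many refresh deadlines each late frame overran, which is an upper bound
on drops since buffering may absorb some of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 60 Hz display, so each frame has a budget of about 16.67 ms. */
#define FRAME_BUDGET_NS 16666667ULL

/* Hypothetical result type, for illustration only. */
struct jank_stats {
	size_t janky_frames;   /* frames that overran their budget */
	size_t missed_vsyncs;  /* refresh deadlines overrun by those frames */
};

/*
 * durations_ns[i] is how long frame i took to produce.
 * A frame is counted as jank when it exceeds budget_ns. A late frame that
 * finishes within two budgets overran one deadline, within three budgets
 * two deadlines, and so on.
 */
static struct jank_stats count_jank(const uint64_t *durations_ns, size_t n,
				    uint64_t budget_ns)
{
	struct jank_stats s = { 0, 0 };

	assert(budget_ns > 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t d = durations_ns[i];

		if (d > budget_ns) {
			s.janky_frames++;
			s.missed_vsyncs += (size_t)((d - 1) / budget_ns);
		}
	}
	return s;
}
```

With durations of 10 ms, 17 ms, 40 ms and 16 ms, this reports two janky
frames: the 17 ms frame is merely delayed by one deadline, while the 40 ms
frame overruns two.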
> I see that on our devices avg-jank is lower. In our case the Android graphics
> pipeline uses vmalloc allocations, which can delay delivery of UI content
> to the GPU. But such behavior depends on your platform, on which parts of the
> system make use of it, and on whether they are time-critical.
>
> The second example is an indirect impact. During analysis of audio glitches
> in high-resolution audio, the source of the drops was long alloc_vmap_area()
> allocations.
>
> # Explanation is here
> ftp://vps418301.ovh.net/incoming/analysis_audio_glitches.txt
>
> # A 10-second audio sample is here.
> # The drop occurs at 00:09.295; you can hear it.
> ftp://vps418301.ovh.net/incoming/tst_440_HZ_tmp_1.wav

Nice.

> > It's a lot of new code. It looks decent and I'll toss it in there for
> > further testing. Hopefully someone will be able to find the time for a
> > detailed review.
> >
> Thank you :)

I can try to do a review fwiw. But I am severely buried right now. I did look
at the vmalloc code before for similar reasons (preempt-off related delays
causing jank / glitches etc). In any case, I'll take another look soon (in the
next 1-2 weeks).

> > Trivial point: the code uses "inline" a lot. Nowadays gcc cheerfully
> > ignores that and does its own thing. You might want to look at the
> > effects of simply deleting all that. Is the generated code better or
> > worse or the same? If something really needs to be inlined then use
> > __always_inline, preferably with a comment explaining why it is there.
> >
> When the main core functionalities are "inlined" I see the benefit.
> At least, it is noticeable with the "test driver". But I agree that
> I should check one more time to see what can be excluded and used
> as a regular call. Thanks for the hint, it is worth going with
> __always_inline instead.

I wonder how clang treats inline hints, since that is the compiler Android
images use to build their kernels.
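
For reference, a small sketch of the two forms being discussed. The helper
names are made up for illustration and are not taken from the patch, and the
macro below is a userspace stand-in for the kernel's own __always_inline
definition. Plain "inline" is only a hint that the compiler's heuristics may
override, while the always_inline attribute, which both gcc and clang accept,
asks for inlining regardless of those heuristics.

```c
#include <assert.h>

/* Userspace stand-in; the kernel provides its own __always_inline. */
#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

/* Hint only: the compiler may or may not inline this. */
static inline unsigned long align_up_hint(unsigned long addr,
					  unsigned long align)
{
	/* align must be a non-zero power of two */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Forced: hypothetical hot-path helper that must be inlined, e.g. because
 * the call overhead showed up in measurements.
 */
static __always_inline unsigned long align_up_forced(unsigned long addr,
						     unsigned long align)
{
	/* align must be a non-zero power of two */
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

Building such code with each compiler and comparing the disassembly is one
way to see whether the plain hint was honored.
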
thanks,
- Joel