Message-ID: <5c425a99-2b51-49a6-a3e5-1f2ef8b5254f@vivo.com>
Date: Wed, 19 Jun 2024 16:35:42 +0800
From: Lei Liu <liulei.rjpt@...o.com>
To: Carlos Llamas <cmllamas@...gle.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Arve Hjønnevåg <arve@...roid.com>,
Todd Kjos <tkjos@...roid.com>, Martijn Coenen <maco@...roid.com>,
Joel Fernandes <joel@...lfernandes.org>,
Christian Brauner <brauner@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, linux-kernel@...r.kernel.org,
opensource.kernel@...o.com
Subject: Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to
mitigate OOM issues
On 2024/6/18 12:37, Carlos Llamas wrote:
> On Tue, Jun 18, 2024 at 10:50:17AM +0800, Lei Liu wrote:
>> On 2024/6/18 2:43, Carlos Llamas wrote:
>>> On Mon, Jun 17, 2024 at 12:01:26PM +0800, Lei Liu wrote:
>>>> On 6/15/2024 at 2:38, Carlos Llamas wrote:
>>> Yes, all this makes sense. What I don't understand is how "performance
>>> of kvcalloc is better". This is not supposed to be.
>> Based on my current understanding:
>> 1. kvmalloc may allocate memory faster than kmalloc in cases of memory
>> fragmentation, which could potentially improve the performance of binder.
> I think there is a misunderstanding of the allocations performed in this
> benchmark test. Yes, in general when there is heavy memory pressure it
> can be faster to use kvmalloc() and not try too hard to reclaim
> contiguous memory.
>
> In the case of binder though, this is the mmap() allocation. This call
> is part of the "initial setup". In the test, there should only be two
> calls to kvmalloc(), since the benchmark is done across two processes.
> That's it.
>
> So the time it takes to allocate this memory is irrelevant to the
> performance results. Does this make sense?
>
>> 2. Memory allocated by kvmalloc may not be contiguous, which could
>> potentially degrade the data read and write speed of binder.
> This _is_ what is being considered in the benchmark test instead. There
> are repeated accesses to alloc->pages[n]. Your point is then the reason
> why I was expecting "same performance at best".
>
>> Hmm, this is really good news. From the current test results, it seems that
>> kvmalloc does not degrade performance for binder.
> Yeah, not in the "happy" case anyways. I'm not sure what the numbers
> look like under some memory pressure.
>
>> I will retest the data on our phone to see if we reach the same conclusion.
>> If kvmalloc still proves to be better, we will provide you with the
>> reproduction method.
>>
> Ok, thanks. I would suggest you do an "adb shell stop" before running
> these tests. This might help with the noise.
>
> Thanks,
> Carlos Llamas
We used the "adb shell stop" command to retest the data. Now, the test
data for kmalloc and kvmalloc are basically consistent. There are a few
instances where kvmalloc may be slightly inferior, but the difference is
not significant, within 3%.

adb shell stop/ kmalloc /8+256G
------------------------------------------------------------------------
Benchmark                   Time     CPU  Iterations    OUTPUT OUTPUTCPU
------------------------------------------------------------------------
BM_sendVec_binder4         39126   18550       38894  3.976282   8.38684
BM_sendVec_binder8         38924   18542       37786  7.766108   16.3028
BM_sendVec_binder16        38328   18228       36700  15.32039   32.2141
BM_sendVec_binder32        38154   18215       38240  32.07213   67.1798
BM_sendVec_binder64        39093   18809       36142  59.16885   122.977
BM_sendVec_binder128       40169   19188       36461  116.1843  243.2253
BM_sendVec_binder256       40695   19559       35951  226.1569  470.5484
BM_sendVec_binder512       41446   20211       34259  423.2159  867.8743
BM_sendVec_binder1024      44040   22939       28904  672.0639  1290.278
BM_sendVec_binder2048      47817   25821       26595  1139.063  2109.393
BM_sendVec_binder4096      54749   30905       22742  1701.423  3014.115
BM_sendVec_binder8192      68316   42017       16684  2000.634  3252.858
BM_sendVec_binder16384     95435   64081       10961  1881.752  2802.469
BM_sendVec_binder32768    148232  107504        6510  1439.093  1984.295
BM_sendVec_binder65536    326499  229874        3178  637.8991  906.0329
NORMAL TEST SUM                                       10355.79  17188.15
stressapptest eat 2G SUM                              10088.39  16625.97

adb shell stop/ kvmalloc /8+256G
------------------------------------------------------------------------
Benchmark                   Time     CPU  Iterations    OUTPUT OUTPUTCPU
------------------------------------------------------------------------
BM_sendVec_binder4         39673   18832       36598  3.689965  7.773577
BM_sendVec_binder8         39869   18969       37188  7.462038  15.68369
BM_sendVec_binder16        39774   18896       36627  14.73405  31.01355
BM_sendVec_binder32        40225   19125       36995  29.43045  61.90013
BM_sendVec_binder64        40549   19529       35148  55.47544  115.1862
BM_sendVec_binder128       41580   19892       35384  108.9262  227.6871
BM_sendVec_binder256       41584   20059       34060  209.6806  434.6857
BM_sendVec_binder512       42829   20899       32493  388.4381  796.0389
BM_sendVec_binder1024      45037   23360       29251  665.0759  1282.236
BM_sendVec_binder2048      47853   25761       27091  1159.433  2153.735
BM_sendVec_binder4096      55574   31745       22405  1651.328  2890.877
BM_sendVec_binder8192      70706   43693       16400  1900.105  3074.836
BM_sendVec_binder16384     96161   64362       10793  1838.921  2747.468
BM_sendVec_binder32768    147875  107292        6296  1395.147  1922.858
BM_sendVec_binder65536    330324  232296        3053  605.7126  861.3209
NORMAL TEST SUM                                       10033.56  16623.35
stressapptest eat 2G SUM                               9958.43  16497.55

Can I prepare the V4 version of the patch now? Do I need to modify
anything else in the V4 version, in addition to addressing the following
two points?

1. Shorten the "backtrace" in the commit message.
2. Modify the code indentation to comply with the community's code style
   requirements.

Thanks,
Lei Liu
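
For readers who want to check the kmalloc and kvmalloc SUM rows against
each other, the following is a minimal sketch in Python. It assumes the
SUM values are copied as printed in the two tables in this message; the
dictionary layout and the helper name relative_change_pct are
illustrative only and are not part of the benchmark tooling.

```python
# SUM rows copied from the two tables in this message: (OUTPUT, OUTPUTCPU).
# The structure and names below are illustrative, not from the benchmark.
sums = {
    "normal": {
        "kmalloc": (10355.79, 17188.15),
        "kvmalloc": (10033.56, 16623.35),
    },
    "stressapptest 2G": {
        "kmalloc": (10088.39, 16625.97),
        "kvmalloc": (9958.43, 16497.55),
    },
}


def relative_change_pct(baseline, candidate):
    """Percentage change of candidate relative to baseline."""
    return (candidate - baseline) / baseline * 100.0


results = {}
for scenario, runs in sums.items():
    # Compare kvmalloc against the kmalloc baseline, column by column.
    results[scenario] = tuple(
        relative_change_pct(k, kv)
        for k, kv in zip(runs["kmalloc"], runs["kvmalloc"])
    )
    out_pct, outcpu_pct = results[scenario]
    print(f"{scenario}: OUTPUT {out_pct:+.2f}%, OUTPUTCPU {outcpu_pct:+.2f}%")
```

With the values above, the kvmalloc sums come out roughly 3.1% (OUTPUT)
and 3.3% (OUTPUTCPU) lower in the normal test, and roughly 1.3% and 0.8%
lower with stressapptest consuming 2G.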