Date: Wed, 19 Jun 2024 16:44:07 +0800
From: Lei Liu <liulei.rjpt@...o.com>
To: Carlos Llamas <cmllamas@...gle.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Arve Hjønnevåg <arve@...roid.com>,
 Todd Kjos <tkjos@...roid.com>, Martijn Coenen <maco@...roid.com>,
 Joel Fernandes <joel@...lfernandes.org>,
 Christian Brauner <brauner@...nel.org>,
 Suren Baghdasaryan <surenb@...gle.com>, linux-kernel@...r.kernel.org,
 opensource.kernel@...o.com
Subject: Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to
 mitigate OOM issues


On 2024/6/18 12:37, Carlos Llamas wrote:
> On Tue, Jun 18, 2024 at 10:50:17AM +0800, Lei Liu wrote:
>> On 2024/6/18 2:43, Carlos Llamas wrote:
>>> On Mon, Jun 17, 2024 at 12:01:26PM +0800, Lei Liu wrote:
>>>> On 6/15/2024 at 2:38, Carlos Llamas wrote:
>>> Yes, all this makes sense. What I don't understand is how "performance
>>> of kvcalloc is better". This is not supposed to be.
>> Based on my current understanding:
>> 1.kvmalloc may allocate memory faster than kmalloc in cases of memory
>> fragmentation, which could potentially improve the performance of binder.
> I think there is a misunderstanding of the allocations performed in this
> benchmark test. Yes, in general when there is heavy memory pressure it
> can be faster to use kvmalloc() and not try too hard to reclaim
> contiguous memory.
>
> In the case of binder though, this is the mmap() allocation. This call
> is part of the "initial setup". In the test, there should only be two
> calls to kvmalloc(), since the benchmark is done across two processes.
> That's it.
>
> So the time it takes to allocate this memory is irrelevant to the
> performance results. Does this make sense?
>
>> 2.Memory allocated by kvmalloc may not be contiguous, which could
>> potentially degrade the data read and write speed of binder.
> This _is_ what is being considered in the benchmark test instead. There
> are repeated accesses to alloc->pages[n]. Your point is then the reason
> why I was expecting "same performance at best".
>
>> Hmm, this is really good news. From the current test results, it seems that
>> kvmalloc does not degrade performance for binder.
> Yeah, not in the "happy" case anyways. I'm not sure what the numbers
> look like under some memory pressure.
>
>> I will retest the data on our phone to see if we reach the same conclusion.
>> If kvmalloc still proves to be better, we will provide you with the
>> reproduction method.
>>
> Ok, thanks. I would suggest you do an "adb shell stop" before running
> these test. This might help with the noise.
>
> Thanks,
> Carlos Llamas

We used the "adb shell stop" command to retest the data.

Now the test data for kmalloc and kvmalloc are basically consistent.

There are a few cases where kvmalloc is slightly slower, but the
difference is not significant (within 3%).

adb shell stop/ kmalloc /8+256G
----------------------------------------------------------------------
Benchmark                Time     CPU   Iterations  OUTPUT OUTPUTCPU
----------------------------------------------------------------------
BM_sendVec_binder4      39126    18550    38894    3.976282 8.38684
BM_sendVec_binder8      38924    18542    37786    7.766108 16.3028
BM_sendVec_binder16     38328    18228    36700    15.32039 32.2141
BM_sendVec_binder32     38154    18215    38240    32.07213 67.1798
BM_sendVec_binder64     39093    18809    36142    59.16885 122.977
BM_sendVec_binder128    40169    19188    36461    116.1843 243.2253
BM_sendVec_binder256    40695    19559    35951    226.1569 470.5484
BM_sendVec_binder512    41446    20211    34259    423.2159 867.8743
BM_sendVec_binder1024   44040    22939    28904    672.0639 1290.278
BM_sendVec_binder2048   47817    25821    26595    1139.063 2109.393
BM_sendVec_binder4096   54749    30905    22742    1701.423 3014.115
BM_sendVec_binder8192   68316    42017    16684    2000.634 3252.858
BM_sendVec_binder16384  95435    64081    10961    1881.752 2802.469
BM_sendVec_binder32768  148232  107504     6510    1439.093 1984.295
BM_sendVec_binder65536  326499  229874     3178    637.8991 906.0329
NORMAL TEST                                 SUM    10355.79 17188.15
stressapptest eat 2G                        SUM    10088.39 16625.97

adb shell stop/ kvmalloc /8+256G
-----------------------------------------------------------------------
Benchmark                Time     CPU   Iterations   OUTPUT OUTPUTCPU
-----------------------------------------------------------------------
BM_sendVec_binder4       39673    18832    36598    3.689965 7.773577
BM_sendVec_binder8       39869    18969    37188    7.462038 15.68369
BM_sendVec_binder16      39774    18896    36627    14.73405 31.01355
BM_sendVec_binder32      40225    19125    36995    29.43045 61.90013
BM_sendVec_binder64      40549    19529    35148    55.47544 115.1862
BM_sendVec_binder128     41580    19892    35384    108.9262 227.6871
BM_sendVec_binder256     41584    20059    34060    209.6806 434.6857
BM_sendVec_binder512     42829    20899    32493    388.4381 796.0389
BM_sendVec_binder1024    45037    23360    29251    665.0759 1282.236
BM_sendVec_binder2048    47853    25761    27091    1159.433 2153.735
BM_sendVec_binder4096    55574    31745    22405    1651.328 2890.877
BM_sendVec_binder8192    70706    43693    16400    1900.105 3074.836
BM_sendVec_binder16384   96161    64362    10793    1838.921 2747.468
BM_sendVec_binder32768  147875   107292     6296    1395.147 1922.858
BM_sendVec_binder65536  330324   232296     3053    605.7126 861.3209
NORMAL TEST                                 SUM     10033.56 16623.35
stressapptest eat 2G                        SUM      9958.43 16497.55
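
For what it's worth, this matches my understanding of why the numbers
stay close: kvmalloc() only falls back to vmalloc() when the kmalloc()
attempt fails, so in the common case alloc->pages is still physically
contiguous and the access pattern you described is unchanged. Roughly
(a simplified model only; the real kvmalloc_node() in mm/util.c does
more flag and size handling):

	/* simplified model of kvmalloc(), not the actual mm/util.c code */
	void *kvmalloc_model(size_t size, gfp_t flags)
	{
		/* first try to get physically contiguous memory ... */
		void *p = kmalloc(size, flags | __GFP_NOWARN | __GFP_NORETRY);

		if (p)
			return p;
		/* ... and only fall back to vmalloc() under fragmentation */
		return vmalloc(size);
	}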


Can I prepare the v4 version of the patch now? Is there anything else I
need to change in v4, apart from addressing the following two points?

1. Shorten the "backtrace" in the commit message.

2. Fix the code indentation to comply with the community's coding-style
requirements.
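
For reference, the functional part of the change stays the same in v4:
just the allocation/free swap for the pages array in binder_alloc.c.
A rough sketch of the hunks (the exact context lines may differ from
the v3 patch):

-	alloc->pages = kcalloc(alloc->buffer_size / PAGE_SIZE,
-			       sizeof(alloc->pages[0]),
-			       GFP_KERNEL);
+	alloc->pages = kvcalloc(alloc->buffer_size / PAGE_SIZE,
+				sizeof(alloc->pages[0]),
+				GFP_KERNEL);

and the matching free in the release path:

-	kfree(alloc->pages);
+	kvfree(alloc->pages);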

Thanks,

Lei Liu
