Message-ID: <303e5809-041b-4270-9462-0f73a5cac062@vivo.com>
Date: Wed, 31 Jul 2024 12:56:45 +0800
From: zhiguojiang <justinjiang@...o.com>
To: Barry Song <21cnbao@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, Will Deacon <will@...nel.org>,
 "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>, Nick Piggin
 <npiggin@...il.com>, Peter Zijlstra <peterz@...radead.org>,
 Arnd Bergmann <arnd@...db.de>, Johannes Weiner <hannes@...xchg.org>,
 Michal Hocko <mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
 Shakeel Butt <shakeel.butt@...ux.dev>, Muchun Song <muchun.song@...ux.dev>,
 linux-arch@...r.kernel.org, cgroups@...r.kernel.org,
 opensource.kernel@...o.com
Subject: Re: [PATCH 0/2] mm: tlb swap entries batch async release



On 2024/7/31 10:18, Barry Song wrote:
> On Tue, Jul 30, 2024 at 7:44 PM Zhiguo Jiang <justinjiang@...o.com> wrote:
>> The main reasons for the prolonged exit of a background process is the
>> time-consuming release of its swap entries. The proportion of swap memory
>> occupied by the background process increases with its duration in the
>> background, and after a period of time, this value can reach 60% or more.
> Do you know the reason? Could they be contending for a cluster lock or
> something?
> Is there any perf data or flamegraph available here?
Hi,

Test data: physical memory occupied by an application at different
time points in the background.
Test platform: 8GB RAM
Test procedure:
After booting, start 15 applications, then observe the physical memory
occupied by the last-launched application at different time points in
the background.

foreground - abbreviation FG
background - abbreviation BG

The app launched last: com.qiyi.video app
|  memory type  | FG 5s  | BG 5s  | BG 1min | BG 3min | BG 5min | BG 10min | BG 15min |
---------------------------------------------------------------------------------------
|     VmRSS(KB) | 453832 | 252300 |  207724 |  206776 |  204364 |  199944  |  199748  |
|   RssAnon(KB) | 247348 |  99296 |   71816 |   71484 |   71268 |   67808  |   67660  |
|   RssFile(KB) | 205536 | 152020 |  134956 |  134340 |  132144 |  131184  |  131136  |
|  RssShmem(KB) |   1048 |    984 |     952 |     952 |     952 |     952  |     952  |
|    VmSwap(KB) | 202692 | 334852 |  362332 |  362664 |  362880 |  366340  |  366488  |
| Swap ratio(%) | 30.87% | 57.03% |  63.56% |  63.69% |  63.97% |  64.69%  |  64.72%  |

The app launched last: com.netease.sky.vivo
|  memory type  | FG 5s  | BG 5s  | BG 1min | BG 3min | BG 5min | BG 10min | BG 15min |
---------------------------------------------------------------------------------------
|     VmRSS(KB) | 435424 | 403564 |  403200 |  401688 |  402996 |  396372  |  396268  |
|   RssAnon(KB) | 151616 | 117252 |  117244 |  115888 |  117088 |  110780  |  110684  |
|   RssFile(KB) | 281672 | 284192 |  283836 |  283680 |  283788 |  283472  |  283464  |
|  RssShmem(KB) |   2136 |   2120 |    2120 |    2120 |    2120 |    2120  |    2120  |
|    VmSwap(KB) | 546584 | 559920 |  559928 |  561284 |  560084 |  566392  |  566488  |
| Swap ratio(%) | 55.66% | 58.11% |  58.14% |  58.29% |  58.16% |  58.83%  |  58.84%  |
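
The "Swap ratio" rows in both tables appear to match
VmSwap / (VmRSS + VmSwap); that formula is inferred from the numbers and
is not stated explicitly. A minimal Python sketch, assuming the fields are
read from /proc/<pid>/status (the helper names are hypothetical):

```python
def parse_status_kb(status_text):
    """Return {field: kB} for the 'kB' lines of a /proc/<pid>/status dump."""
    fields = {}
    for line in status_text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "<Name>:   <number> kB".
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_ratio_percent(vm_rss_kb, vm_swap_kb):
    """Swapped-out share of the process footprint (assumed formula)."""
    total = vm_rss_kb + vm_swap_kb
    return 100.0 * vm_swap_kb / total if total else 0.0
```

For the com.qiyi.video "FG 5s" column, swap_ratio_percent(453832, 202692)
gives about 30.87, which agrees with the table.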

Perf data for an exiting background process:
|      interfaces              | cost(ms) | exe(ms) | average(ms) | run counts |
--------------------------------------------------------------------------------
| do_signal                    |  791.813 |   0     |     791.813 |     1      |
| get_signal                   |  791.813 |   0     |     791.813 |     1      |
| do_group_exit                |  791.813 |   0     |     791.813 |     1      |
| do_exit                      |  791.813 |   0.148 |     791.813 |     1      |
| exit_mm                      |  577.859 |   0     |     577.859 |     1      |
| __mmput                      |  577.859 |   0.202 |     577.859 |     1      |
| exit_mmap                    |  577.497 |   1.806 |     192.499 |     3      |
| __oom_reap_task_mm           |  562.869 |   2.695 |     562.869 |     1      |
| unmap_page_range             |  562.07  |   3.185 |      20.817 |    27      |
| zap_pte_range                |  558.645 | 123.958 |      15.518 |    36      |
| free_swap_and_cache          |  433.381 |  28.831 |       6.879 |    63      |
| free_swap_slot               |  403.568 |   4.876 |       4.248 |    95      |
| swapcache_free_entries       |  398.292 |   3.578 |       3.588 |   111      |
| swap_entry_free              |  393.863 |  13.953 |       3.176 |   124      |
| swap_range_free              |  372.602 | 202.478 |       1.791 |   208      |
| $x.204 [zram]                |  132.389 |   0.341 |        0.33 |   401      |
| zram_reset_device            |  131.888 |  22.376 |       0.326 |   405      |
| obj_free                     |   80.101 |  29.517 |        0.21 |   381      |
| zs_create_pool               |   29.381 |   2.772 |       0.124 |   237      |
| clear_shadow_from_swap_cache |   22.846 |  22.686 |        0.11 |   208      |
| __put_page                   |   19.317 |  10.088 |       0.105 |   184      |
| pr_memcg_info                |   13.038 |   1.181 |        0.11 |   118      |
| free_pcp_prepare             |    9.229 |   0.812 |       0.094 |    98      |
| xxx_memcg_out                |    9.223 |   4.746 |       0.098 |    94      |
| free_pgtables                |    8.813 |   3.302 |       8.813 |     1      |
| zs_compact                   |    8.617 |   8.43  |       0.097 |    89      |
| kmem_cache_free              |    7.483 |   4.595 |       0.084 |    89      |
| __mem_cgroup_uncharge_swap   |    6.348 |   3.03  |       0.086 |    74      |
| $x.178 [zsmalloc]            |    6.182 |   0.32  |        0.09 |    69      |
| $x.182 [zsmalloc]            |    5.019 |   0.08  |       0.088 |    57      |
cost - total time consumption.
exe - total actual execution time.

According to the perf data, the main reason for the prolonged exit of a
background process is the time-consuming release of its swap entries.
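
As a rough cross-check of that conclusion, the share of do_exit time spent
under free_swap_and_cache can be computed from the perf table. This sketch
assumes "cost" is inclusive time (the function plus its callees) and that
"average" is cost divided by run counts, which is consistent with the rows
shown; the dictionary simply copies a few rows, and the helper names are
hypothetical:

```python
# (cost_ms, run_counts) copied from the perf table.
# Reading "cost" as inclusive time is an assumption, not stated in the data.
rows = {
    "do_exit": (791.813, 1),
    "exit_mmap": (577.497, 3),
    "zap_pte_range": (558.645, 36),
    "free_swap_and_cache": (433.381, 63),
    "swap_range_free": (372.602, 208),
}


def average_ms(name):
    """Average cost per call: total cost divided by run counts."""
    cost, runs = rows[name]
    return cost / runs


def share_of_exit(name):
    """Percentage of the do_exit cost attributed to the given function."""
    return 100.0 * rows[name][0] / rows["do_exit"][0]


for name in rows:
    print(f"{name:22s} avg={average_ms(name):8.3f} ms  "
          f"share={share_of_exit(name):5.1f}%")
```

With these numbers, free_swap_and_cache accounts for roughly 55% of the
do_exit time.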

The release of swap entries is slow not only because of the cluster
locks, but also because of the swp_slots lock and the swap_info lock; in
addition, the zram and swap-disk free paths are time-consuming.
>
>> Additionally, the relatively lengthy path for releasing swap entries
>> further contributes to the longer time required for the background process
>> to release its swap entries.
>>
>> In the multiple background applications scenario, when launching a large
>> memory application such as a camera, system may enter a low memory state,
>> which will triggers the killing of multiple background processes at the
>> same time. Due to multiple exiting processes occupying multiple CPUs for
>> concurrent execution, the current foreground application's CPU resources
>> are tight and may cause issues such as lagging.
>>
>> To solve this problem, we have introduced the multiple exiting process
>> asynchronous swap memory release mechanism, which isolates and caches
>> swap entries occupied by multiple exit processes, and hands them over
>> to an asynchronous kworker to complete the release. This allows the
>> exiting processes to complete quickly and release CPU resources. We have
>> validated this modification on the products and achieved the expected
>> benefits.
>>
>> It offers several benefits:
>> 1. Alleviate the high system cpu load caused by multiple exiting
>>     processes running simultaneously.
>> 2. Reduce lock competition in swap entry free path by an asynchronous
>   Do you have data on which lock is affected? Could it be a cluster lock?
The release of swap entries is slow not only because of the cluster
locks, but also because of the swp_slots lock and the swap_info lock; in
addition, the zram and swap-disk free paths are time-consuming. In short,
the swap entry release path is relatively long compared with the file and
anonymous folio release paths.
>
>>     kworker instead of multiple exiting processes parallel execution.
>> 3. Release memory occupied by exiting processes more efficiently.
>>
>> Zhiguo Jiang (2):
>>    mm: move task_is_dying to h headfile
>>    mm: tlb: multiple exiting processes's swap entries async release
>>
>>   include/asm-generic/tlb.h |  50 +++++++
>>   include/linux/mm_types.h  |  58 ++++++++
>>   include/linux/oom.h       |   6 +
>>   mm/memcontrol.c           |   6 -
>>   mm/memory.c               |   3 +-
>>   mm/mmu_gather.c           | 297 ++++++++++++++++++++++++++++++++++++++
>>   6 files changed, 413 insertions(+), 7 deletions(-)
>>   mode change 100644 => 100755 include/asm-generic/tlb.h
>>   mode change 100644 => 100755 include/linux/mm_types.h
>>   mode change 100644 => 100755 include/linux/oom.h
>>   mode change 100644 => 100755 mm/memcontrol.c
>>   mode change 100644 => 100755 mm/memory.c
>>   mode change 100644 => 100755 mm/mmu_gather.c
> Can you check your local filesystem to determine why you're running
> the chmod command?
OK, I will check it carefully.

Thanks
Zhiguo
>
>> --
>> 2.39.0
>>
> Thanks
> Barry

