lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 7 Mar 2024 15:40:29 +0800
From: Bixuan Cui <cuibixuan@...o.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Steven Rostedt <rostedt@...dmis.org>, akpm@...ux-foundation.org,
 mhiramat@...nel.org, mathieu.desnoyers@...icios.com,
 linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 linux-mm@...ck.org, opensource.kernel@...o.com
Subject: Re: [PATCH -next v6 0/2] Make memory reclamation measurable



在 2024/2/21 15:44, Michal Hocko 写道:
> It would be really helpful to have more details on why we need those 
> trace points. It is my understanding that you would like to have a more 
> fine grained numbers for the time duration of different parts of the 
> reclaim process. I can imagine this could be useful in some cases but is 
> it useful enough and for a wider variety of workloads? Is that worth a 
> dedicated static tracepoints? Why an add-hoc dynamic tracepoints or BPF 
> for a very special situation is not sufficient? In other words, tell us 
> more about the usecases and why is this generally useful.
Thank you for your reply, I'm sorry that I forgot to describe the 
detailed reason.

Memory reclamation usually occurs when there is high memory pressure (or 
low memory) and is performed by Kswapd. In embedded systems, CPU 
resources are limited, and it is common for kswapd and critical 
processes (which typically require a large amount of memory and trigger 
memory reclamation) to compete for CPU resources. which in turn affects 
the execution of this key process, causing the execution time to 
increase and causing lags,such as dropped frames or slower startup times 
in mobile games.
Currently, with the help of kernel trace events or tools like Perfetto, 
we can only see that kswapd is competing for CPU and the frequency of 
memory reclamation triggers, but we do not have detailed information or 
metrics about memory reclamation, such as the duration and amount of 
each reclamation, or who is releasing memory (super_cache, f2fs, ext4), 
etc. This makes it impossible to locate the above problems.

Currently this patch helps us solve 2 actual performance problems 
(kswapd preempts the CPU causing game delay)
1. The increased memory allocation in the game (across different 
versions) has led to the degradation of kswapd.
     This is found by calculating the total amount of Reclaim(page) 
during the game startup phase.

2. The adoption of a different file system in the new system version has 
resulted in a slower reclamation rate.
     This is discovered through the OBJ_NAME change. For example, 
OBJ_NAME changes from super_cache_scan to ext4_es_scan.

Subsequently, it is also possible to calculate the memory reclamation 
rate to evaluate the memory performance of different versions.



The main reasons for adding static tracepoints are:
1. To subdivide the time spent in the shrinker->count_objects() and 
shrinker->scan_objects() functions within the do_shrink_slab function. 
Using BPF kprobe, we can only track the time spent in the do_shrink_slab 
function.
2. When tracing frequently called functions, static tracepoints (BPF 
tp/tracepoint) have lower performance impact compared to dynamic 
tracepoints (BPF kprobe).

Thanks
Bixuan Cui

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ