linux-kernel - Re: [PATCH v2 0/3] mm: tlb swap entries batch async release

Message-Id: <20240731091715.b78969467c002fa3a120e034@linux-foundation.org>
Date: Wed, 31 Jul 2024 09:17:15 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Zhiguo Jiang <justinjiang@...o.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, Will Deacon
 <will@...nel.org>, "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>, Nick
 Piggin <npiggin@...il.com>, Peter Zijlstra <peterz@...radead.org>, Arnd
 Bergmann <arnd@...db.de>, Johannes Weiner <hannes@...xchg.org>, Michal
 Hocko <mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
 Shakeel Butt <shakeel.butt@...ux.dev>, Muchun Song <muchun.song@...ux.dev>,
 linux-arch@...r.kernel.org, cgroups@...r.kernel.org, Barry Song
 <21cnbao@...il.com>, kernel test robot <lkp@...el.com>,
 opensource.kernel@...o.com
Subject: Re: [PATCH v2 0/3] mm: tlb swap entries batch async release

On Wed, 31 Jul 2024 21:33:14 +0800 Zhiguo Jiang <justinjiang@...o.com> wrote:

> The main reason for the prolonged exit of a background process is the

The kernel really doesn't have a concept of a "background process". 
It's a userspace concept - perhaps "the parent process isn't waiting on
this process via wait()".

I assume here you're referring to an Android userspace concept?  I
expect that when Android "backgrounds" a process, it does lots of
things to that process.  Perhaps scheduling priority, perhaps
alteration of various MM tunables, etc.

So rather than referring to "backgrounding" it would be better to
identify what tuning alterations are made to such processes to bring
about this behavior.

> time-consuming release of its swap entries. The proportion of swap memory
> occupied by the background process increases with its duration in the
> background, and after a period of time, this value can reach 60% or more.

Again, what is it about the tuning of such processes which causes this
behavior?

> Additionally, the relatively lengthy path for releasing swap entries
> further contributes to the longer time required for the background process
> to release its swap entries.
> 
> In the multiple background applications scenario, when launching a large
> memory application such as a camera, the system may enter a low memory
> state, which will trigger the killing of multiple background processes at
> the same time. Because the exiting processes occupy multiple CPUs for
> concurrent execution, the current foreground application's CPU resources
> are tight, which may cause issues such as lagging.
> 
> To solve this problem, we have introduced the multiple exiting process
> asynchronous swap memory release mechanism, which isolates and caches
> swap entries occupied by multiple exit processes, and hands them over
> to an asynchronous kworker to complete the release. This allows the
> exiting processes to complete quickly and release CPU resources. We have
> validated this modification on the products and achieved the expected
> benefits.

Dumb question: why can't this be done in userspace?  The exiting
process does fork/exit and lets the child do all this asynchronous freeing?
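
To make the question concrete, here is a minimal userspace sketch of that
fork/exit hand-off. It is an illustration only, not something taken from the
patch series: the function name exit_via_child_demo() and the pipe-based
"wait until the parent is gone" trick are assumptions made for the example.
The exiting process still has to unmap its own page tables, so the sketch
shows the pattern rather than demonstrating that it is faster.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: simulate a process that wants to exit quickly and
 * leaves a forked child holding the last references to its memory.
 * Returns the exit status of the "exiting" process, or -1 on error.
 */
int exit_via_child_demo(size_t bytes)
{
    pid_t exiting = fork();
    if (exiting < 0)
        return -1;

    if (exiting == 0) {
        /* This is the process that wants to exit quickly. */
        int fds[2];
        char *buf = malloc(bytes);

        if (!buf || pipe(fds) < 0)
            _exit(1);
        memset(buf, 0xa5, bytes);   /* touch the pages so they are populated */

        pid_t helper = fork();      /* helper shares the pages copy-on-write */
        if (helper == 0) {
            char c;

            close(fds[1]);          /* keep only the read end */
            /* read() returns 0 (EOF) once the parent's write end is closed,
             * which happens when the parent exits. */
            while (read(fds[0], &c, 1) > 0)
                ;
            _exit(0);               /* helper's exit drops the remaining references */
        }
        /* If fork() failed we just exit and do the teardown ourselves. */
        _exit(0);
    }

    int status;
    if (waitpid(exiting, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling exit_via_child_demo(1 << 20) from a test program returns as soon as
the "exiting" process is reaped; the orphaned helper finishes on its own and
is reaped by whichever process it is reparented to.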

> It offers several benefits:
> 1. Alleviate the high system cpu load caused by multiple exiting
>    processes running simultaneously.
> 2. Reduce lock contention in the swap entry free path by using one
>    asynchronous kworker instead of multiple exiting processes in parallel.

Why is lock contention reduced?  The same amount of work needs to be
done.

> 3. Release memory occupied by exiting processes more efficiently.

Probably it's slightly less efficient.

There are potential problems with this approach of passing work to a
kernel thread:

- The process will exit while its resources are still allocated.  But
  its parent process assumes those resources are now all freed and the
  parent process then proceeds to allocate resources.  This results in
  a time period where peak resource consumption is higher than it was
  before such a change.

- If all CPUs are running in userspace with realtime policy
  (SCHED_FIFO, for example) then the kworker thread will not run,
  indefinitely.

- Work which should have been accounted to the exiting process will
  instead go unaccounted.  

So please fully address all these potential issues.
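
The accounting point in the last item can be observed from userspace. The
sketch below is an illustration under assumptions: the helper names
children_cpu_usec() and child_cpu_delta_demo() are made up for this example.
It shows that CPU time consumed by a child, including time spent in the
kernel on its behalf, is charged to that child and becomes visible to the
parent through getrusage(RUSAGE_CHILDREN) once the child is reaped. Work
moved into a kworker would not appear in this total.

```c
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: user + system CPU time, in microseconds, charged to
 * all children this process has reaped so far. Returns -1 on error. */
static long long children_cpu_usec(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_CHILDREN, &ru) < 0)
        return -1;
    return (long long)(ru.ru_utime.tv_sec + ru.ru_stime.tv_sec) * 1000000LL
         + ru.ru_utime.tv_usec + ru.ru_stime.tv_usec;
}

/*
 * Hypothetical demo: run a child that burns some CPU, reap it, and return
 * how much CPU time that added to this process's RUSAGE_CHILDREN total.
 * Returns -1 on error.
 */
long long child_cpu_delta_demo(unsigned long iterations)
{
    long long before = children_cpu_usec();
    if (before < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: do some work; volatile keeps the loop from being removed. */
        volatile unsigned long sink = 0;

        for (unsigned long i = 0; i < iterations; i++)
            sink += i;
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) < 0)
        return -1;

    long long after = children_cpu_usec();
    if (after < 0)
        return -1;
    return after - before;   /* CPU time accounted to the reaped child */
}
```

The returned delta depends on the machine and on timer granularity, so only
its sign is meaningful in a test; the totals are cumulative and never
decrease.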