Message-Id: <20250519131151.988900-1-chenridong@huaweicloud.com>
Date: Mon, 19 May 2025 13:11:49 +0000
From: Chen Ridong <chenridong@...weicloud.com>
To: akpm@...ux-foundation.org,
	Liam.Howlett@...cle.com,
	lorenzo.stoakes@...cle.com,
	vbabka@...e.cz,
	jannh@...gle.com,
	pfalcato@...e.de,
	bigeasy@...utronix.de,
	paulmck@...nel.org,
	chenridong@...wei.com,
	roman.gushchin@...ux.dev,
	brauner@...nel.org,
	pmladek@...e.com,
	geert@...ux-m68k.org,
	mingo@...nel.org,
	rrangel@...omium.org,
	francesco@...la.it,
	kpsingh@...nel.org,
	guoweikang.kernel@...il.com,
	link@...o.com,
	viro@...iv.linux.org.uk,
	neil@...wn.name,
	nichen@...as.ac.cn,
	tglx@...utronix.de,
	frederic@...nel.org,
	peterz@...radead.org,
	oleg@...hat.com,
	joel.granados@...nel.org,
	linux@...ssschuh.net,
	avagin@...gle.com,
	legion@...nel.org
Cc: linux-kernel@...r.kernel.org,
	linux-mm@...ck.org,
	lujialin4@...wei.com
Subject: [RFC next v2 0/2] ucounts: turn the atomic rlimit to percpu_counter

From: Chen Ridong <chenridong@...wei.com>

Running the will-it-scale test case signal1 [1] shows that the
signal-sending system call does not scale linearly. To investigate
further, we ran a series of tests with varying numbers of Docker
containers and monitored the throughput of each container. The results
are as follows:

	| Dockers     |1      |4      |8      |16     |32     |64     |
	| Throughput  |380068 |353204 |308948 |306453 |180659 |129152 |

The data shows a clear trend: as the number of containers increases, the
per-container throughput declines, from 380068 with one container to
129152 with 64 (about 66% lower). Analysis identified the root cause:
the ucounts code accounts rlimit usage with a large number of atomic
operations. When these atomic operations act on the same variable, they
cause many cache misses or remote accesses, which degrades performance.

This patch set addresses scalability issues in the ucounts rlimit by
replacing atomic rlimit counters with percpu_counter, which distributes
counts across CPU cores to reduce cache contention under heavy load.
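
As a rough illustration of why distributing a count helps, the following
userspace sketch keeps one cache-line-aligned slot per shard, so that
updaters touching different shards do not contend on the same cache line.
It is not the code from this series: the names sharded_counter, sc_init,
sc_add and sc_sum, the NSHARDS value and the 64-byte cache-line size are
all assumptions made for the example. The kernel's percpu_counter also
folds per-CPU deltas into a shared count in batches, which this sketch
omits.

```c
#include <assert.h>
#include <stdatomic.h>

#define NSHARDS   8   /* hypothetical: stands in for the number of CPUs */
#define CACHELINE 64  /* assumed cache-line size in bytes */

/* One slot per shard, aligned so that each slot owns a full cache line. */
struct shard {
	_Alignas(CACHELINE) atomic_long val;
};

struct sharded_counter {
	struct shard shards[NSHARDS];
};

static void sc_init(struct sharded_counter *c)
{
	for (int i = 0; i < NSHARDS; i++)
		atomic_init(&c->shards[i].val, 0);
}

/* Fast path: update only the caller's own shard. */
static void sc_add(struct sharded_counter *c, unsigned int shard, long delta)
{
	assert(shard < NSHARDS);
	atomic_fetch_add_explicit(&c->shards[shard].val, delta,
				  memory_order_relaxed);
}

/* Slow path: walk every shard to obtain the total. */
static long sc_sum(struct sharded_counter *c)
{
	long sum = 0;

	for (int i = 0; i < NSHARDS; i++)
		sum += atomic_load_explicit(&c->shards[i].val,
					    memory_order_relaxed);
	return sum;
}
```

The trade-off in this sketch is that updates are cheap while reading the
total requires visiting every shard, which is why it helps to keep the
number of full summations low.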

Patch 1 ensures that a ucount is freed only once both its refcount and
its rlimit counts are fully released, minimizing redundant summations.
Patch 2 converts the atomic rlimit counters to percpu_counter, as
suggested by Andrew.

[1] https://github.com/antonblanchard/will-it-scale/blob/master/tests/

---
v2: use percpu_counter instead of cache rlimit.

v1: https://lore.kernel.org/lkml/20250509072054.148257-1-chenridong@huaweicloud.com/

Chen Ridong (2):
  ucounts: free ucount only count and rlimit are zero
  ucounts: turn the atomic rlimit to percpu_counter

 include/linux/user_namespace.h |  17 +++-
 init/main.c                    |   1 +
 ipc/mqueue.c                   |   6 +-
 kernel/signal.c                |   8 +-
 kernel/ucount.c                | 169 +++++++++++++++++++++++----------
 mm/mlock.c                     |   5 +-
 6 files changed, 138 insertions(+), 68 deletions(-)

-- 
2.34.1