linux-kernel - [PATCH RFC 4/6] mm, vma: use sheaves for vm_area

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20241112-slub-percpu-caches-v1-4-ddc0bdc27e05@suse.cz>
Date: Tue, 12 Nov 2024 17:38:48 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Suren Baghdasaryan <surenb@...gle.com>, 
 "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
 Christoph Lameter <cl@...ux.com>, David Rientjes <rientjes@...gle.com>, 
 Pekka Enberg <penberg@...nel.org>, Joonsoo Kim <iamjoonsoo.kim@....com>
Cc: Roman Gushchin <roman.gushchin@...ux.dev>, 
 Hyeonggon Yoo <42.hyeyoo@...il.com>, 
 "Paul E. McKenney" <paulmck@...nel.org>, 
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, 
 Matthew Wilcox <willy@...radead.org>, Boqun Feng <boqun.feng@...il.com>, 
 Uladzislau Rezki <urezki@...il.com>, linux-mm@...ck.org, 
 linux-kernel@...r.kernel.org, rcu@...r.kernel.org, 
 maple-tree@...ts.infradead.org, Vlastimil Babka <vbabka@...e.cz>
Subject: [PATCH RFC 4/6] mm, vma: use sheaves for vm_area_struct cache

Create the vm_area_struct cache with percpu sheaves of size 32 to
hopefully improve its performance. For CONFIG_PER_VMA_LOCK, change the
vma freeing from custom call_rcu() callback to kfree_rcu() which
will perform rcu_free sheaf batching. Since there may be additional
structures attached and they are freed only after the grace period,
create a __vma_area_rcu_free_dtor() to do that.

Note I have not investigated whether vma_numab_state_free() or
free_anon_vma_name() must really need to wait for the grace period. For
vma_lock_free() ideally we wouldn't free it at all when freeing the vma
to the sheaf (or even slab page), but that would require using also a
ctor for vmas to allocate the vma lock, and reintroducing dtor support
for deallocating the lock when freeing slab pages containing the vmas.

The plan is to move vma_lock into vma itself anyway, so if the rest can
be freed immediately, the whole destructor support won't be needed
anymore.

Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
---
 kernel/fork.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 22f43721d031d48fd5be2606e86642334be9735f..9b1ae5aaf6a58fded6c9ac378809296825eba9fa 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -516,22 +516,24 @@ void __vm_area_free(struct vm_area_struct *vma)
 	kmem_cache_free(vm_area_cachep, vma);
 }

-#ifdef CONFIG_PER_VMA_LOCK
-static void vm_area_free_rcu_cb(struct rcu_head *head)
+static void __vma_area_rcu_free_dtor(void *ptr)
 {
-	struct vm_area_struct *vma = container_of(head, struct vm_area_struct,
-						  vm_rcu);
+	struct vm_area_struct *vma = ptr;

 	/* The vma should not be locked while being destroyed. */
+#ifdef CONFIG_PER_VMA_LOCK
 	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock->lock), vma);
-	__vm_area_free(vma);
-}
 #endif

+	vma_numab_state_free(vma);
+	free_anon_vma_name(vma);
+	vma_lock_free(vma);
+}
+
 void vm_area_free(struct vm_area_struct *vma)
 {
 #ifdef CONFIG_PER_VMA_LOCK
-	call_rcu(&vma->vm_rcu, vm_area_free_rcu_cb);
+	kfree_rcu(vma, vm_rcu);
 #else
 	__vm_area_free(vma);
 #endif
@@ -3155,6 +3157,12 @@ void __init mm_cache_init(void)

 void __init proc_caches_init(void)
 {
+	struct kmem_cache_args vm_args = {
+		.align = __alignof__(struct vm_area_struct),
+		.sheaf_capacity = 32,
+		.sheaf_rcu_dtor = __vma_area_rcu_free_dtor,
+	};
+
 	sighand_cachep = kmem_cache_create("sighand_cache",
 			sizeof(struct sighand_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_TYPESAFE_BY_RCU|
@@ -3172,7 +3180,10 @@ void __init proc_caches_init(void)
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT,
 			NULL);

-	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
+	vm_area_cachep = kmem_cache_create("vm_area_struct",
+				sizeof(struct vm_area_struct), &vm_args,
+				SLAB_PANIC|SLAB_ACCOUNT);
+
 #ifdef CONFIG_PER_VMA_LOCK
 	vma_lock_cachep = KMEM_CACHE(vma_lock, SLAB_PANIC|SLAB_ACCOUNT);
 #endif

-- 
2.47.0