Message-ID: <20251114201329.3275875-1-ameryhung@gmail.com>
Date: Fri, 14 Nov 2025 12:13:22 -0800
From: Amery Hung <ameryhung@...il.com>
To: bpf@...r.kernel.org
Cc: netdev@...r.kernel.org,
	alexei.starovoitov@...il.com,
	andrii@...nel.org,
	daniel@...earbox.net,
	martin.lau@...nel.org,
	memxor@...il.com,
	kpsingh@...nel.org,
	yonghong.song@...ux.dev,
	song@...nel.org,
	ameryhung@...il.com,
	kernel-team@...a.com
Subject: [PATCH v2 bpf-next 0/4] Replace BPF memory allocator with kmalloc_nolock() in local storage

Hi,

This patchset tries to simplify bpf_local_storage.c by adopting
kmalloc_nolock(). This removes memory preallocation and reduces the
dependency on smap in bpf_selem_free() and bpf_local_storage_free().
The latter will simplify a future refactor that replaces
local_storage->lock and b->lock [1].

RFC v1 tried to switch to kmalloc_nolock() unconditionally. However,
that caused a substantial performance loss in socket local storage due
to 1) defer_free() in kfree_nolock() and 2) no kfree_rcu() batching, so
replacing kzalloc() is postponed until the necessary improvements land
in mm.

Benchmark

./bench -p 1 local-storage-create --storage-type <socket,task> \
  --batch-size <16,32,64>

This is a microbenchmark that stress-tests how fast local storage can
be created. For task local storage, switching from the BPF memory
allocator to kmalloc_nolock() yields a small improvement (+2.2% to
+5.1%). For socket local storage, the numbers stay roughly the same,
as expected, since that path still uses kzalloc() and is not changed.

Socket local storage
memory alloc     batch  creation speed      kmallocs/create  speed diff
---------------  -----  ------------------  ---------------  ----------
kzalloc           16    144.149 ± 0.642k/s  3.10
(before)          32    144.379 ± 1.070k/s  3.08
                  64    144.491 ± 0.818k/s  3.13

kzalloc           16    146.180 ± 1.403k/s  3.10             +1.4%
(not changed)     32    146.245 ± 1.272k/s  3.10             +1.3%
                  64    145.012 ± 1.545k/s  3.10             +0.4%
                   
Task local storage
memory alloc     batch  creation speed      kmallocs/create  speed diff
---------------  -----  ------------------  ---------------  ----------
BPF memory        16     24.668 ± 0.121k/s  2.54
allocator         32     22.899 ± 0.097k/s  2.67
(before)          64     22.559 ± 0.076k/s  2.56

kmalloc_nolock    16     25.796 ± 0.059k/s  2.52             +4.6%
(after)           32     23.412 ± 0.069k/s  2.50             +2.2%
                  64     23.717 ± 0.108k/s  2.60             +5.1%
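
For readers who want to check the diff column: it appears to be the
relative change of the mean creation speed against the "before" run at
the same batch size. The sketch below is not part of the patchset or of
the bench tool. It assumes that interpretation, ignores the ± spread,
and uses made-up helper names (speed_diff_percent, diff_column) on the
means copied from the tables above.

```python
# Mean creation speeds in k/s, copied from the tables above.
# Each entry maps batch size -> (before, after).
SOCKET = {16: (144.149, 146.180), 32: (144.379, 146.245), 64: (144.491, 145.012)}
TASK = {16: (24.668, 25.796), 32: (22.899, 23.412), 64: (22.559, 23.717)}


def speed_diff_percent(before, after):
    """Relative change of `after` against `before`, in percent."""
    return (after - before) / before * 100.0


def diff_column(results):
    """Format the relative change per batch size with one decimal place."""
    return {
        batch: f"{speed_diff_percent(before, after):+.1f}%"
        for batch, (before, after) in results.items()
    }


if __name__ == "__main__":
    for name, results in (("socket", SOCKET), ("task", TASK)):
        for batch, diff in diff_column(results).items():
            print(f"{name:6s} batch {batch:2d}: {diff}")
```

Under that assumption the computed values match the posted diff column
for both storage types.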


[1] https://lore.kernel.org/bpf/20251002225356.1505480-1-ameryhung@gmail.com/


v1 -> v2
  - Only replace BPF memory allocator with kmalloc_nolock()
  Link: https://lore.kernel.org/bpf/20251112175939.2365295-1-ameryhung@gmail.com/

---

Amery Hung (4):
  bpf: Always charge/uncharge memory when allocating/unlinking storage
    elements
  bpf: Remove smap argument from bpf_selem_free()
  bpf: Save memory allocation info in bpf_local_storage
  bpf: Replace bpf memory allocator with kmalloc_nolock() in local
    storage

 include/linux/bpf_local_storage.h |  10 +-
 kernel/bpf/bpf_local_storage.c    | 235 +++++++++---------------------
 net/core/bpf_sk_storage.c         |   4 +-
 3 files changed, 74 insertions(+), 175 deletions(-)

-- 
2.47.3

