Message-ID: <20250210210541.867037-1-mjguzik@gmail.com>
Date: Mon, 10 Feb 2025 22:05:41 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: kees@...nel.org,
	luto@...capital.net,
	wad@...omium.org
Cc: linux-kernel@...r.kernel.org,
	Mateusz Guzik <mjguzik@...il.com>
Subject: [PATCH] seccomp: avoid the lock trip in seccomp_filter_release in common case

The vast majority of threads don't have any seccomp filters, while the
lock taken here is shared between all threads in a given process and is
frequently used.

Signed-off-by: Mateusz Guzik <mjguzik@...il.com>
---

Here is a splat from parallel thread creation/destruction within one
process:

bpftrace -e 'kprobe:__pv_queued_spin_lock_slowpath { @[kstack()] = count(); }'

[snip]
@[
    __pv_queued_spin_lock_slowpath+5
    _raw_spin_lock_irq+42
    seccomp_filter_release+32
    do_exit+286
    __x64_sys_exit+27
    x64_sys_call+4703
    do_syscall_64+82
    entry_SYSCALL_64_after_hwframe+118
]: 475601
@[
    __pv_queued_spin_lock_slowpath+5
    _raw_spin_lock_irq+42
    acct_collect+77
    do_exit+1380
    __x64_sys_exit+27
    x64_sys_call+4703
    do_syscall_64+82
    entry_SYSCALL_64_after_hwframe+118
]: 478335
@[
    __pv_queued_spin_lock_slowpath+5
    _raw_spin_lock_irq+42
    sigprocmask+106
    __x64_sys_rt_sigprocmask+121
    do_syscall_64+82
    entry_SYSCALL_64_after_hwframe+118
]: 1825572

There are more spots which take the same lock, with seccomp being in the
top 3.
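
For reference, a workload of the kind described (threads being created and
destroyed in parallel within one process) could look roughly like the
sketch below. This is not the program used to produce the stacks above;
the names run_churn, churn_worker, short_lived and MAX_WORKERS are made up
for illustration, and the worker and iteration counts are arbitrary.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper names; none of this comes from the original posting. */
#define MAX_WORKERS 64

struct churn_arg {
	int iters;	/* threads this worker should create and join */
	int created;	/* threads it actually managed to create */
};

/* Exits immediately, so the cost is dominated by thread setup and teardown. */
static void *short_lived(void *unused)
{
	(void)unused;
	return NULL;
}

/* Each worker creates and joins short-lived threads back to back. */
static void *churn_worker(void *p)
{
	struct churn_arg *arg = p;
	pthread_t t;

	for (int i = 0; i < arg->iters; i++) {
		if (pthread_create(&t, NULL, short_lived, NULL) != 0)
			break;
		pthread_join(t, NULL);
		arg->created++;
	}
	return NULL;
}

/* Runs nworkers workers in parallel; returns the number of threads churned. */
long run_churn(int nworkers, int iters)
{
	pthread_t workers[MAX_WORKERS];
	struct churn_arg args[MAX_WORKERS];
	long total = 0;
	int started = 0;

	assert(nworkers > 0 && nworkers <= MAX_WORKERS);

	for (int i = 0; i < nworkers; i++) {
		args[i].iters = iters;
		args[i].created = 0;
		if (pthread_create(&workers[i], NULL, churn_worker, &args[i]) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++) {
		pthread_join(workers[i], NULL);
		total += args[i].created;
	}
	return total;
}
```

Calling run_churn() with a worker count close to the number of CPUs and a
large iteration count, while the bpftrace one-liner runs in another
terminal, should be enough to see whether the same lock shows up.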

I could not be bothered to bench before/after, but I can do it if you
insist. The fact that this codepath is a factor can be seen above.

This is a minor patch, I'm not going to insist on it.

To my reading seccomp only ever gets populated for current, so this
should be perfectly safe to test on exit without any synchronisation.

This may need a data_race annotation if some tooling decides to protest.
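
The idea can be sketched in userspace as follows. This is only an analogy
and not kernel code: struct shared_state, struct task, lock_trips and
task_filter_release are hypothetical stand-ins, with a pthread mutex
playing the role of the siglock. The sketch relies on the same assumption
as above, namely that the filter pointer is only ever set by the thread
that owns it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names come from the kernel. */
struct shared_state {
	pthread_mutex_t lock;		/* plays the role of sighand->siglock */
	unsigned long lock_trips;	/* counts lock acquisitions, for illustration */
};

struct task {
	struct shared_state *shared;
	void *filter;			/* assumed to be set only by the owning thread */
};

/* Returns 1 if the shared lock was taken, 0 if the fast path skipped it. */
int task_filter_release(struct task *t)
{
	void *orig;

	assert(t != NULL && t->shared != NULL);

	/*
	 * Fast path: nobody else can install a filter for this task, so an
	 * unlocked NULL check is enough to know there is nothing to release.
	 */
	if (t->filter == NULL)
		return 0;

	pthread_mutex_lock(&t->shared->lock);
	t->shared->lock_trips++;
	orig = t->filter;
	t->filter = NULL;		/* detach while holding the lock */
	pthread_mutex_unlock(&t->shared->lock);

	free(orig);
	return 1;
}
```

A task without a filter returns before touching the shared lock, which is
the common case the patch is after; a task with a filter goes through the
locked path as before.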

 kernel/seccomp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 7bbb408431eb..c839674966e2 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -576,6 +576,9 @@ void seccomp_filter_release(struct task_struct *tsk)
 	if (WARN_ON((tsk->flags & PF_EXITING) == 0))
 		return;
 
+	if (tsk->seccomp.filter == NULL)
+		return;
+
 	spin_lock_irq(&tsk->sighand->siglock);
 	orig = tsk->seccomp.filter;
 	/* Detach task from its filter tree. */
-- 
2.43.0

