Message-ID: <20241106-transparent-athletic-ammonite-586af8@leitao>
Date: Wed, 6 Nov 2024 04:03:34 -0800
From: Breno Leitao <leitao@...ian.org>
To: Andrii Nakryiko <andrii@...nel.org>
Cc: linux-trace-kernel@...r.kernel.org, peterz@...radead.org,
	oleg@...hat.com, rostedt@...dmis.org, mhiramat@...nel.org,
	bpf@...r.kernel.org, linux-kernel@...r.kernel.org, jolsa@...nel.org,
	paulmck@...nel.org, willy@...radead.org, surenb@...gle.com,
	akpm@...ux-foundation.org, linux-mm@...ck.org
Subject: Re: [PATCH v5 4/8] uprobes: travers uprobe's consumer list
 locklessly under SRCU protection

Hello Andrii,

On Tue, Sep 03, 2024 at 10:45:59AM -0700, Andrii Nakryiko wrote:
> uprobe->register_rwsem is one of a few big bottlenecks to scalability of
> uprobes, so we need to get rid of it to improve uprobe performance and
> multi-CPU scalability.
> 
> First, we turn uprobe's consumer list to a typical doubly-linked list
> and utilize existing RCU-aware helpers for traversing such lists, as
> well as adding and removing elements from it.
> 
> For entry uprobes we already have SRCU protection active since before
> uprobe lookup. For uretprobe we keep refcount, guaranteeing that uprobe
> won't go away from under us, but we add SRCU protection around consumer
> list traversal.

I am seeing the following message in a kernel built with
CONFIG_PROVE_RCU_LIST=y:

	kernel/events/uprobes.c:937 RCU-list traversed without holding the required lock!!

It seems SRCU is not held when coming from mmap_region ->
uprobe_mmap. Here is the message I got in my debug kernel (sorry for
not decoding it, but the stack trace is clear enough):

         WARNING: suspicious RCU usage
           6.12.0-rc5-kbuilder-01152-gc688a96c432e #26 Tainted: G        W   E    N
           -----------------------------
           kernel/events/uprobes.c:938 RCU-list traversed without holding the required lock!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
           3 locks held by env/441330:
            #0: ffff00021c1bc508 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x84/0x1d0
            #1: ffff800089f3ab48 (&uprobes_mmap_mutex[i]){+.+.}-{3:3}, at: uprobe_mmap+0x20c/0x548
            #2: ffff0004e564c528 (&uprobe->consumer_rwsem){++++}-{3:3}, at: filter_chain+0x30/0xe8

stack backtrace:
           CPU: 4 UID: 34133 PID: 441330 Comm: env Kdump: loaded Tainted: G        W   E    N 6.12.0-rc5-kbuilder-01152-gc688a96c432e #26
           Tainted: [W]=WARN, [E]=UNSIGNED_MODULE, [N]=TEST
           Hardware name: Quanta S7GM 20S7GCU0010/S7G MB (CG1), BIOS 3D22 07/03/2024
           Call trace:
            dump_backtrace+0x10c/0x198
            show_stack+0x24/0x38
            __dump_stack+0x28/0x38
            dump_stack_lvl+0x74/0xa8
            dump_stack+0x18/0x28
            lockdep_rcu_suspicious+0x178/0x2c8
            filter_chain+0xdc/0xe8
            uprobe_mmap+0x2e0/0x548
            mmap_region+0x510/0x988
            do_mmap+0x444/0x528
            vm_mmap_pgoff+0xf8/0x1d0
            ksys_mmap_pgoff+0x184/0x2d8


That said, it seems we want to hold SRCU before reaching
filter_chain(). I hacked a bit, and taking the lock in uprobe_mmap()
solves the problem, but I might be missing something, since I am not
familiar with this code.

How does the following patch look?

commit 1bd7bcf03031ceca86fdddd8be2e5500497db29f
Author: Breno Leitao <leitao@...ian.org>
Date:   Mon Nov 4 06:53:31 2024 -0800

    uprobes: Get SRCU lock before traversing the list

    list_for_each_entry_srcu() is being called without holding the lock,
    which causes LOCKDEP (with CONFIG_PROVE_RCU_LIST enabled) to emit a
    warning such as:

            kernel/events/uprobes.c:937 RCU-list traversed without holding the required lock!!

    Take the uprobes_srcu read lock before calling filter_chain(), which
    needs the SRCU lock held, since it is going to call
    list_for_each_entry_srcu().

    Signed-off-by: Breno Leitao <leitao@...ian.org>
    Fixes: cc01bd044e6a ("uprobes: travers uprobe's consumer list locklessly under SRCU protection")

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 4b52cb2ae6d62..cc9d4ddeea9a6 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1391,6 +1391,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
 	struct list_head tmp_list;
 	struct uprobe *uprobe, *u;
 	struct inode *inode;
+	int srcu_idx;

 	if (no_uprobe_events())
 		return 0;
@@ -1409,6 +1410,7 @@ int uprobe_mmap(struct vm_area_struct *vma)

 	mutex_lock(uprobes_mmap_hash(inode));
 	build_probe_list(inode, vma, vma->vm_start, vma->vm_end, &tmp_list);
+	srcu_idx = srcu_read_lock(&uprobes_srcu);
 	/*
 	 * We can race with uprobe_unregister(), this uprobe can be already
 	 * removed. But in this case filter_chain() must return false, all
@@ -1422,6 +1424,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
 		}
 		put_uprobe(uprobe);
 	}
+	srcu_read_unlock(&uprobes_srcu, srcu_idx);
 	mutex_unlock(uprobes_mmap_hash(inode));

 	return 0;
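
For readers less familiar with the pattern, here is a minimal userspace
sketch of what the patch above changes. It is not kernel code: every
toy_* name is hypothetical, the "SRCU domain" is just a reader counter,
and the warning counter only stands in for the lockdep splat. It shows
that the list walk itself works either way, but the check only passes
when the caller has entered the read-side section first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an SRCU domain: it only counts active readers. */
struct toy_srcu {
	int readers;
};

/* Hypothetical analogue of srcu_read_lock(): enter a read-side section. */
static int toy_srcu_read_lock(struct toy_srcu *s)
{
	s->readers++;
	return 0; /* the real API returns an index that unlock must receive */
}

/* Hypothetical analogue of srcu_read_unlock(): leave the section. */
static void toy_srcu_read_unlock(struct toy_srcu *s, int idx)
{
	(void)idx; /* unused in this toy model */
	s->readers--;
}

/* Hypothetical analogue of srcu_read_lock_held(). */
static bool toy_srcu_read_lock_held(const struct toy_srcu *s)
{
	return s->readers > 0;
}

/* Simplified consumer: only records whether it is interested in the mm. */
struct toy_consumer {
	bool wants_mm;
	struct toy_consumer *next;
};

/* Counts the points where a lockdep-style check would have complained. */
static int toy_warnings;

/*
 * Models the shape of filter_chain(): check that the caller is inside a
 * read-side section, then walk the list looking for an interested consumer.
 */
static bool toy_filter_chain(const struct toy_srcu *s,
			     const struct toy_consumer *head)
{
	if (!toy_srcu_read_lock_held(s))
		toy_warnings++; /* stand-in for the "RCU-list traversed" warning */

	for (const struct toy_consumer *c = head; c; c = c->next)
		if (c->wants_mm)
			return true;
	return false;
}
```

Calling toy_filter_chain() without toy_srcu_read_lock() bumps
toy_warnings once; wrapping the call in lock/unlock, as the patch does
around the loop in uprobe_mmap(), leaves the counter unchanged.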

