Message-Id: <20240711090704.556216a0bca595ad44ee9dbf@kernel.org>
Date: Thu, 11 Jul 2024 09:07:04 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: andrii@...nel.org, peterz@...radead.org, clm@...a.com, jolsa@...nel.org,
 mingo@...nel.org, paulmck@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] uprobes: document the usage of mm->mmap_lock

On Wed, 10 Jul 2024 17:10:07 +0200
Oleg Nesterov <oleg@...hat.com> wrote:

> On 07/10, Masami Hiramatsu wrote:
> >
> > On Wed, 10 Jul 2024 16:00:45 +0200
> > Oleg Nesterov <oleg@...hat.com> wrote:
> >
> > > The comment above uprobe_write_opcode() is wrong, unapply_uprobe() calls
> > > it under mmap_read_lock() and this is correct.
> > >
> > > And it is completely unclear why register_for_each_vma() takes mmap_lock
> > > for writing, add a comment to explain that mmap_write_lock() is needed to
> > > avoid the following race:
> > >
> > > 	- A task T hits the bp installed by uprobe and calls
> > > 	  find_active_uprobe()
> > >
> > > 	- uprobe_unregister() removes this uprobe/bp
> > >
> > > 	- T calls find_uprobe() which returns NULL
> > >
> > > 	- another uprobe_register() installs the bp at the same address
> > >
> > > 	- T calls is_trap_at_addr() which returns true
> > >
> > > 	- T returns to handle_swbp() and gets SIGTRAP.
> 
> ...
> 
> > >  int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
> > > @@ -1046,7 +1046,12 @@ register_for_each_vma(struct uprobe *uprobe, struct uprobe_consumer *new)
> > >
> > >  		if (err && is_register)
> > >  			goto free;
> > > -
> > > +		/*
> > > +		 * We take mmap_lock for writing to avoid the race with
> > > +		 * find_active_uprobe(), install_breakpoint() must not
> > > +		 * make is_trap_at_addr() true right after find_uprobe()
> > > +		 * returns NULL.
> >
> > Sorry, I couldn't catch the latter part. What is the relationship of
> > taking the mmap_lock and install_breakpoint() and is_trap_at_addr() here?
> 
> Please see the changelog above, it tries to explain this race with more
> details...

OK, but it seems we should write the above longer explanation here.
What about the comment like this?

/*
 * We take mmap_lock for writing to avoid the race with
 * find_active_uprobe() and is_trap_at_addr() on the reader
 * side.
 * If the reader, which hits a swbp and is handling it,
 * does not take mmap_lock for reading, it is possible
 * that find_active_uprobe() returns NULL (because
 * uprobe_unregister() removes the uprobe right before that),
 * but is_trap_at_addr() can return true afterwards (because
 * another thread calls uprobe_register() on the same address).
 * This causes an unexpected SIGTRAP on the reader thread.
 * Taking mmap_lock avoids this race.
 */
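
To check my understanding, here is a minimal userspace sketch of that
interleaving. It is only a model under my own assumptions: struct model,
run_scenario() and the OUTCOME_* names are hypothetical and do not exist
in the kernel, and the boolean flag merely stands in for whether
mmap_write_lock() keeps install_breakpoint() out of the reader's window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; not the kernel's data structures. */
struct model {
	bool uprobe_in_tree;	/* what find_uprobe() would observe */
	bool trap_in_text;	/* what is_trap_at_addr() would observe */
};

enum outcome { OUTCOME_HANDLED, OUTCOME_RESTART, OUTCOME_SIGTRAP };

/* Stand-in for uprobe_unregister(): drops the uprobe and its breakpoint. */
static void model_unregister(struct model *m)
{
	m->uprobe_in_tree = false;
	m->trap_in_text = false;
}

/*
 * Stand-in for task T handling the breakpoint it already hit.
 * writer_excluded == true models register_for_each_vma() taking
 * mmap_lock for writing, so install_breakpoint() cannot run between
 * the reader's lookup and its trap check.
 */
static enum outcome model_handle_swbp(struct model *m, bool writer_excluded)
{
	bool is_swbp;

	if (m->uprobe_in_tree)		/* find_uprobe() found it */
		return OUTCOME_HANDLED;

	/* A concurrent uprobe_register() inserts the new uprobe now. */
	m->uprobe_in_tree = true;
	if (!writer_excluded)
		m->trap_in_text = true;	/* install slips into the window */

	is_swbp = m->trap_in_text;	/* is_trap_at_addr() */

	if (writer_excluded)
		m->trap_in_text = true;	/* install runs after the reader */

	/* uprobe == NULL: a trap in the text means SIGTRAP, else restart. */
	return is_swbp ? OUTCOME_SIGTRAP : OUTCOME_RESTART;
}

/* T has hit the bp, then uprobe_unregister() wins before T's lookup. */
static enum outcome run_scenario(bool writer_excluded)
{
	struct model m = { .uprobe_in_tree = true, .trap_in_text = true };

	assert(m.uprobe_in_tree && m.trap_in_text);
	model_unregister(&m);
	return model_handle_swbp(&m, writer_excluded);
}
```

In this model, run_scenario(false) ends in OUTCOME_SIGTRAP, while
run_scenario(true) ends in OUTCOME_RESTART, so T just re-executes the
instruction and hits the newly installed breakpoint normally.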

> 
> > You meant that find_active_uprobe() is using find_uprobe() which searches
> > for the uprobe in the rbtree?
> 
> Yes,
> 
> > But it seems uprobe is already inserted to the rbtree
> > in alloc_uprobe() so find_uprobe() will not return NULL here, right?
> 
> uprobe_register() -> alloc_uprobe() can come after
> find_active_uprobe() -> find_uprobe() returns NULL.
> 
> Now, if uprobe_register() -> register_for_each_vma() used mmap_read_lock(), it
> could do install_breakpoint() before find_active_uprobe() calls is_trap_at_addr().
> 
> In this case find_active_uprobe() returns with uprobe == NULL and is_swbp == 1,
> so handle_swbp() treats this case as the "normal" int3 without uprobe and does
> 
> 	if (!uprobe) {
> 		if (is_swbp > 0) {
> 			/* No matching uprobe; signal SIGTRAP. */
> 			force_sig(SIGTRAP);
> 
> Does this answer your question?

Yes, thanks for the explanation.

Thank you!

> 
> Oleg.
> 


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>