lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 3 Oct 2011 15:37:10 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Cc:	Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...e.hu>,
	Steven Rostedt <rostedt@...dmis.org>,
	Linux-mm <linux-mm@...ck.org>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Hugh Dickins <hughd@...gle.com>,
	Christoph Hellwig <hch@...radead.org>,
	Jonathan Corbet <corbet@....net>,
	Thomas Gleixner <tglx@...utronix.de>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jim Keniston <jkenisto@...ux.vnet.ibm.com>,
	Roland McGrath <roland@...k.frob.com>,
	Andi Kleen <andi@...stfloor.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 3.1.0-rc4-tip 4/26]   uprobes: Define hooks for
	mmap/munmap.

On 09/20, Srikar Dronamraju wrote:
>
> @@ -739,6 +740,10 @@ struct mm_struct *dup_mm(struct task_struct *tsk)
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  	mm->pmd_huge_pte = NULL;
>  #endif
> +#ifdef CONFIG_UPROBES
> +	atomic_set(&mm->mm_uprobes_count,
> +			atomic_read(&oldmm->mm_uprobes_count));

Hmm. Why this can't race with install_breakpoint/remove_breakpoint
between _read and _set ?

What about VM_DONTCOPY vma's with breakpoints ?

> -static int match_uprobe(struct uprobe *l, struct uprobe *r)
> +static int match_uprobe(struct uprobe *l, struct uprobe *r, int *match_inode)
>  {
> +	/*
> +	 * if match_inode is non NULL then indicate if the
> +	 * inode atleast match.
> +	 */
> +	if (match_inode)
> +		*match_inode = 0;
> +
>  	if (l->inode < r->inode)
>  		return -1;
>  	if (l->inode > r->inode)
>  		return 1;
>  	else {
> +		if (match_inode)
> +			*match_inode = 1;
> +

It is very possible I missed something, but imho this looks confusing.

This close_match logic is only needed for build_probe_list() and
dec_mm_uprobes_count(), and both do not actually need the returned
uprobe.

Instead of complicating match_uprobe() and __find_uprobe(), perhaps
it makes sense to add "struct rb_node *__find_close_rb_node(inode)" ?

> +static int install_breakpoint(struct mm_struct *mm, struct uprobe *uprobe)
>  {
>  	/* Placeholder: Yet to be implemented */
> +	if (!uprobe->consumers)
> +		return 0;

How it is possible to see ->consumers == NULL?

OK, afaics it _is_ possible, but only because unregister does del_consumer()
without ->i_mutex, but this is bug afaics (see the previous email).

Another user is mmap_uprobe() and it checks ->consumers != NULL itself (but
see below).

> +int mmap_uprobe(struct vm_area_struct *vma)
> +{
> +	struct list_head tmp_list;
> +	struct uprobe *uprobe, *u;
> +	struct inode *inode;
> +	int ret = 0;
> +
> +	if (!valid_vma(vma))
> +		return ret;	/* Bail-out */
> +
> +	inode = igrab(vma->vm_file->f_mapping->host);
> +	if (!inode)
> +		return ret;
> +
> +	INIT_LIST_HEAD(&tmp_list);
> +	mutex_lock(&uprobes_mmap_mutex);
> +	build_probe_list(inode, &tmp_list);
> +	list_for_each_entry_safe(uprobe, u, &tmp_list, pending_list) {
> +		loff_t vaddr;
> +
> +		list_del(&uprobe->pending_list);
> +		if (!ret && uprobe->consumers) {
> +			vaddr = vma->vm_start + uprobe->offset;
> +			vaddr -= vma->vm_pgoff << PAGE_SHIFT;
> +			if (vaddr < vma->vm_start || vaddr >= vma->vm_end)
> +				continue;
> +			ret = install_breakpoint(vma->vm_mm, uprobe);

So. We are adding the new mapping, we should find all breakpoints this
file has in the start/end range.

We are holding ->mmap_sem... this seems enough to protect against the
races with register/unregister. Except, what if __register_uprobe()
fails? In this case __unregister_uprobe() does delete_uprobe() at the
very end. What if mmap mmap_uprobe() is called right before delete_?

> +static void dec_mm_uprobes_count(struct vm_area_struct *vma,
> +		struct inode *inode)
> +{
> +	struct uprobe *uprobe;
> +	struct rb_node *n;
> +	unsigned long flags;
> +
> +	n = uprobes_tree.rb_node;
> +	spin_lock_irqsave(&uprobes_treelock, flags);
> +	uprobe = __find_uprobe(inode, 0, &n);
> +
> +	/*
> +	 * If indeed there is a probe for the inode and with offset zero,
> +	 * then lets release its reference. (ref got thro __find_uprobe)
> +	 */
> +	if (uprobe)
> +		put_uprobe(uprobe);
> +	for (; n; n = rb_next(n)) {
> +		loff_t vaddr;
> +
> +		uprobe = rb_entry(n, struct uprobe, rb_node);
> +		if (uprobe->inode != inode)
> +			break;
> +		vaddr = vma->vm_start + uprobe->offset;
> +		vaddr -= vma->vm_pgoff << PAGE_SHIFT;
> +		if (vaddr < vma->vm_start || vaddr >= vma->vm_end)
> +			continue;
> +		atomic_dec(&vma->vm_mm->mm_uprobes_count);

So, this does atomic_dec() for each bp in this vma?

And the caller is

> @@ -1337,6 +1338,9 @@ unsigned long unmap_vmas(struct mmu_gather *tlb,
>  		if (unlikely(is_pfn_mapping(vma)))
>  			untrack_pfn_vma(vma, 0, 0);
>
> +		if (vma->vm_file)
> +			munmap_uprobe(vma);

Doesn't look right...

munmap_uprobe() assumes that the whole region goes away. This is
true in munmap() case afaics, it does __split_vma() if necessary.

But what about truncate() ? In this case this vma is not unmapped,
but unmap_vmas() is called anyway and [start, end) can be different.
IOW, unless I missed something (this is very possible) we can do
more atomic_dec's then needed.

Also, truncate() obviously changes ->i_size. Doesn't this mean
unregister_uprobe() should return if offset > i_size ? We need to
free uprobes anyway.

MADV_DONTNEED? It calls unmap_vmas() too. And application can do
madvise(DONTNEED) in a loop.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ