linux-kernel - Re: [PATCH 3/3] uprobes: write_opcode()->__replace_page() can race with try_to

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20120615061217.GC12051@linux.vnet.ibm.com>
Date:	Fri, 15 Jun 2012 11:42:17 +0530
From:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Hugh Dickins <hughd@...gle.com>, Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Anton Arapov <anton@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] uprobes: write_opcode()->__replace_page() can race
 with try_to_unmap()

> ---
> Subject: [PATCH] uprobes: write_opcode()->__replace_page() can race with try_to_unmap()
> 
> write_opcode() gets old_page via get_user_pages() and then calls
> __replace_page() which assumes that this old_page is still mapped
> after pte_offset_map_lock().
> 
> This is not true if this old_page was already try_to_unmap()'ed,
> and in this case everything __replace_page() does with old_page
> is wrong. Just for example, put_page() is not balanced.
> 
> I think it is possible to teach __replace_page() to handle this
> unlikely case correctly, but this patch simply changes it to use
> page_check_address() and return -EAGAIN if it fails. The caller
> should notice this error code and retry.
> 
> Note: write_opcode() asks for the cleanups, I'll try to do this
> in a separate patch.
> 
> Signed-off-by: Oleg Nesterov <oleg@...hat.com>
> ---
>  kernel/events/uprobes.c |   41 +++++++++++++----------------------------
>  1 files changed, 13 insertions(+), 28 deletions(-)
> 
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index 9c53bc2..54c8780 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -135,33 +135,17 @@ static loff_t vma_address(struct vm_area_struct *vma, loff_t offset)
>  static int __replace_page(struct vm_area_struct *vma, struct page *page, struct page *kpage)
>  {
>  	struct mm_struct *mm = vma->vm_mm;
> -	pgd_t *pgd;
> -	pud_t *pud;
> -	pmd_t *pmd;
> -	pte_t *ptep;
> -	spinlock_t *ptl;
>  	unsigned long addr;
> -	int err = -EFAULT;
> +	spinlock_t *ptl;
> +	pte_t *ptep;
> 
>  	addr = page_address_in_vma(page, vma);
>  	if (addr == -EFAULT)
> -		goto out;
> -
> -	pgd = pgd_offset(mm, addr);
> -	if (!pgd_present(*pgd))
> -		goto out;
> -
> -	pud = pud_offset(pgd, addr);
> -	if (!pud_present(*pud))
> -		goto out;
> +		return -EFAULT;
> 
> -	pmd = pmd_offset(pud, addr);
> -	if (!pmd_present(*pmd))
> -		goto out;
> -
> -	ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
> +	ptep = page_check_address(page, mm, addr, &ptl, 0);
>  	if (!ptep)
> -		goto out;
> +		return -EAGAIN;
> 
>  	get_page(kpage);
>  	page_add_new_anon_rmap(kpage, vma, addr);
> @@ -180,10 +164,8 @@ static int __replace_page(struct vm_area_struct *vma, struct page *page, struct
>  		try_to_free_swap(page);
>  	put_page(page);
>  	pte_unmap_unlock(ptep, ptl);
> -	err = 0;
> 
> -out:
> -	return err;
> +	return 0;
>  }
> 
>  /**
> @@ -228,9 +210,10 @@ static int write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  	void *vaddr_old, *vaddr_new;
>  	struct vm_area_struct *vma;
>  	struct uprobe *uprobe;
> +	unsigned long pgoff;
>  	loff_t addr;
>  	int ret;
> -
> +retry:

Just a check on coding style: Shouldnt we have a preceeding blank
line before the goto label.

>  	/* Read the page with vaddr into memory */
>  	ret = get_user_pages(NULL, mm, vaddr, 1, 0, 0, &old_page, &vma);
>  	if (ret <= 0)
> @@ -275,9 +258,9 @@ static int write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
>  	memcpy(vaddr_new, vaddr_old, PAGE_SIZE);
> 
>  	/* poke the new insn in, ASSUMES we don't cross page boundary */
> -	vaddr &= ~PAGE_MASK;
> -	BUG_ON(vaddr + UPROBE_SWBP_INSN_SIZE > PAGE_SIZE);
> -	memcpy(vaddr_new + vaddr, &opcode, UPROBE_SWBP_INSN_SIZE);
> +	pgoff = (vaddr & ~PAGE_MASK);
> +	BUG_ON(pgoff + UPROBE_SWBP_INSN_SIZE > PAGE_SIZE);
> +	memcpy(vaddr_new + pgoff, &opcode, UPROBE_SWBP_INSN_SIZE);
> 
>  	kunmap_atomic(vaddr_new);
>  	kunmap_atomic(vaddr_old);
> @@ -297,6 +280,8 @@ unlock_out:
>  put_out:
>  	put_page(old_page);
> 
> +	if (unlikely(ret == -EAGAIN))
> +		goto retry;
>  	return ret;
>  }
> 

Looks good. 
Acked-by: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>

-- 
Thanks and Regards
Srikar

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/