linux-kernel - Re: One (possible) x86 get_user

Message-Id: <4D41A651020000780002ED36@vpn.id2.novell.com>
Date:	Thu, 27 Jan 2011 16:07:29 +0000
From:	"Jan Beulich" <JBeulich@...ell.com>
To:	"Xiaowei Yang" <xiaowei.yang@...wei.com>,
	"Nick Piggin" <npiggin@...nel.dk>
Cc:	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
	<fanhenglong@...wei.com>, "Kaushik Barde" <kbarde@...wei.com>,
	"Kenneth Lee" <liguozhu@...wei.com>,
	"linqaingmin" <linqiangmin@...wei.com>, <wangzhenguo@...wei.com>,
	"Wu Fengguang" <fengguang.wu@...el.com>,
	"xen-devel@...ts.xensource.com" <xen-devel@...ts.xensource.com>,
	<linux-kernel@...r.kernel.org>
Subject: Re: One (possible) x86 get_user_pages bug

>>> On 27.01.11 at 14:05, Xiaowei Yang <xiaowei.yang@...wei.com> wrote:
> We created a scenario to reproduce the bug:
> ----------------------------------------------------------------
> // proc1/proc1.2 are 2 threads sharing one page table.
> // proc1 is the parent of proc2.
> 
> proc1               proc2          proc1.2
> ...                 ...            // in gup_pte_range()
> ...                 ...            pte = gup_get_pte()
> ...                 ...            page1 = pte_page(pte)  // (1)
> do_wp_page(page1)   ...            ...
> ...                 exit_map()     ...
> ...                 ...            get_page(page1)        // (2)
> -----------------------------------------------------------------
> 
> do_wp_page() and exit_map() cause page1 to be released into free list 
> before get_page() in proc1.2 is called. The longer the delay between 
> (1)&(2), the easier the BUG_ON shows.

Contrary to my initial response, I don't think this can happen outside
of Xen: do_wp_page() won't reach page_cache_release() while
gup_pte_range() is running for the same mm on another CPU,
since it can't get past ptep_clear_flush() (which has to wait for the
CPU in get_user_pages_fast() to re-enable interrupts).
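
To make that ordering argument concrete, here is a minimal userspace
model of it. This is only a sketch, not kernel code: struct cpu_model,
enum flush_kind and all function names are hypothetical, and the
FLUSH_BY_HYPERVISOR case encodes an assumption (that a paravirtualized
flush can complete without the remote CPU taking an interrupt) rather
than anything established in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the CPU running get_user_pages_fast(). */
struct cpu_model {
	bool irqs_disabled;	/* inside the lockless page table walk */
	bool flush_pending;	/* a flush IPI arrived but is not yet handled */
};

enum flush_kind {
	FLUSH_BY_IPI,		/* remote CPU must take an interrupt */
	FLUSH_BY_HYPERVISOR	/* assumption: done without involving the remote CPU */
};

/* The walker disables interrupts before reading any PTE. */
void walker_enter(struct cpu_model *w)
{
	w->irqs_disabled = true;
}

/* Leaving the walk re-enables interrupts; reports whether a flush was waiting. */
bool walker_exit(struct cpu_model *w)
{
	bool had_pending = w->flush_pending;

	assert(w->irqs_disabled);
	w->irqs_disabled = false;
	w->flush_pending = false;
	return had_pending;
}

/*
 * Models the flush half of ptep_clear_flush() as seen by the CPU that
 * wants to free the page.  Returns true if the flush has completed, so
 * the caller may go on to release the page.
 */
bool flush_completes(struct cpu_model *walker, enum flush_kind kind)
{
	if (kind == FLUSH_BY_HYPERVISOR)
		return true;		/* nothing waits for the walker */
	if (walker->irqs_disabled) {
		walker->flush_pending = true;
		return false;		/* IPI cannot be handled yet */
	}
	return true;
}

/* Can the page be released while another CPU is in the middle of the walk? */
bool page_can_be_freed_during_walk(enum flush_kind kind)
{
	struct cpu_model walker = { false, false };
	bool freed;

	walker_enter(&walker);			/* (1) pte read happens in here */
	freed = flush_completes(&walker, kind);	/* do_wp_page() on another CPU */
	walker_exit(&walker);			/* (2) get_page() was before this */
	return freed;
}
```

In this model page_can_be_freed_during_walk(FLUSH_BY_IPI) is false and
page_can_be_freed_during_walk(FLUSH_BY_HYPERVISOR) is true, i.e. the
window between (1) and (2) is only protected when the flush really has
to wait for the walking CPU.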

> An experimental patch is made to prevent the PTE being modified in the 
> middle of gup_pte_range(). The BUG_ON disappears afterward.
> 
> However, from the comments embedded in gup.c, it seems deliberate to 
> avoid the lock in the fast path. The question is: if so, how to avoid 
> the above scenario?

Nick, since you did the initial implementation, would you be able
to estimate whether disabling get_user_pages_fast() altogether for
Xen would perform measurably worse than adding the locks (while
continuing to avoid acquiring mm->mmap_sem), as suggested by
Xiaowei? That of course only matters if the latter is correct at all,
of which I haven't fully convinced myself yet.

Jan
