Date:	Tue, 15 Jul 2014 13:46:26 -0700 (PDT)
From:	Hugh Dickins <hughd@...gle.com>
To:	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
cc:	Konstantin Khlebnikov <koct9i@...il.com>,
	Hugh Dickins <hughd@...gle.com>,
	Ingo Korb <ingo.korb@...dortmund.de>,
	Ning Qu <quning@...gle.com>, Dave Jones <davej@...hat.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on
 process exit

On Tue, 15 Jul 2014, Kirill A. Shutemov wrote:
> Konstantin Khlebnikov wrote:
> > On Tue, Jul 15, 2014 at 2:55 PM, Kirill A. Shutemov
> > <kirill.shutemov@...ux.intel.com> wrote:
> > > Konstantin Khlebnikov wrote:
> > >> It seems the bounding logic in do_fault_around is wrong:
> > >>
> > >> start_addr = max(address & fault_around_mask(), vma->vm_start);
> > >> off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
> > >> pte -= off;
> > >> pgoff -= off;
> > >>
> > >> OK, off <= 511, but it might be bigger than the pte offset within the pte table.
> > >
> > > I don't see how that's possible: fault_around_mask() cannot be more than
> > > 0x1ff000 (x86-64, fault_around_bytes == 2M). It means start_addr will be
> > > aligned to a 2M boundary in this case, which is the start of the page
> > > table the pte belongs to.
> > >
> > > Do I miss something?
> > 
> > Nope, you're right. This fixes kernel crash but not the original problem.
> > 
> > The problem is caused by calling do_fault_around for a _non-linear_ fault.
> > In this case pgoff is shifted and might become negative during the
> > calculation.
> > I'll send another patch.
> 
> I've got to the same conclusion. My patch is below.

Many thanks to Ingo and Konstantin and Kirill for nailing this.
So now we have two not-quite-identical patches to fix it.
I feel I have to judge a beauty contest.

I think my slight preference is for Kirill's below, because it has
a better description (mentions "kernel BUG at mm/filemap.c:202!" and
Ccs stable) and uses the familiar VM_NONLINEAR flag rather than the
never-heard-of-before-and-otherwise-unused FAULT_FLAG_NONLINEAR.

But please please add a credit to Ingo, who made the breakthrough for
us, and to Konstantin who analysed what was going on.  Ingo, this is
not quite the version you tested...

... ah, forget it, Andrew has just now gone for Konstantin's,
adding in more info from Kirill's: that's fine.

Thanks all,
Hugh

> 
> From dd761b693cd06c649499e913713ae5bc7c029f6e Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
> Date: Tue, 15 Jul 2014 14:40:02 +0300
> Subject: [PATCH] mm: avoid do_fault_around() on non-linear mappings
> 
> Originally, I wrongly assumed that non-linear mappings are always
> populated at least with pte_file() entries there, so the !pte_none() check
> will catch them. That's not always the case: we can get there from
> __mm_populte in remap_file_pages() and the pte will be clear.

__mm_populate

> 
> Let's add an explicit check for non-linear mappings.
> 
> This is the root cause of the recent "kernel BUG at mm/filemap.c:202!".
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Cc: stable@...r.kernel.org # 3.15+
> ---
>  mm/memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index d67fd9fcf1f2..440ad48266d6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2882,7 +2882,8 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	 * if page by the offset is not ready to be mapped (cold cache or
>  	 * something).
>  	 */
> -	if (vma->vm_ops->map_pages && fault_around_pages() > 1) {
> +	if (vma->vm_ops->map_pages && fault_around_pages() > 1 &&
> +			!(vma->vm_flags & VM_NONLINEAR)) {
>  		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
>  		do_fault_around(vma, address, pte, pgoff, flags);
>  		if (!pte_same(*pte, orig_pte))
> -- 
> 2.0.1
> 
> -- 
>  Kirill A. Shutemov
