linux-kernel - Re: pipe/page fault oddness.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20141002142503.GA13203@node.dhcp.inet.fi>
Date:	Thu, 2 Oct 2014 17:25:03 +0300
From:	"Kirill A. Shutemov" <kirill@...temov.name>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Sasha Levin <sasha.levin@...cle.com>,
	Hugh Dickins <hughd@...gle.com>, Dave Jones <davej@...hat.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Rik van Riel <riel@...hat.com>,
	Ingo Molnar <mingo@...hat.com>,
	Michel Lespinasse <walken@...gle.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Mel Gorman <mgorman@...e.de>
Subject: Re: pipe/page fault oddness.

On Wed, Oct 01, 2014 at 03:42:53PM -0700, Linus Torvalds wrote:
> On Wed, Oct 1, 2014 at 3:08 PM, Sasha Levin <sasha.levin@...cle.com> wrote:
> >
> > I've tried this patch on the same configuration that was triggering
> > the VM_BUG_ON that Hugh mentioned previously. Surprisingly enough it
> > ran fine for ~20 minutes before exploding with:
> 
> Well, that's somewhat encouraging. I didn't expect it to be perfect.
> 
> That said, "ran fine" isn't necessarily the same thing as "worked".
> Who knows how buggy it was without showing overt symptoms until the
> BUG_ON() triggered. But hey, I'll be optimistic.
> 
> > [ 2781.566206] kernel BUG at mm/huge_memory.c:1293!
> 
> So that's
> 
>         BUG_ON(is_huge_zero_page(page));
> 
> and the reason is trivial: the old code used to have a magical special
> case for the zero-page hugepage (see change_huge_pmd()) and I got rid
> of that (because now it's just about setting protections, and the
> zero-page hugepage is in no way special.
> 
> So I think the solution is equally trivial: just accept that the
> zero-page can happen, and ignore it (just un-numa it).
> 
> Appended is a incremental diff on top of the previous one. Even less
> tested than the last case, but I think you get the idea if it doesn't
> work as-is.
> 
>              Linus

> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 14de54af6c38..fc33952d59c4 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1290,7 +1290,9 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
>  	}
>  
>  	page = pmd_page(pmd);
> -	BUG_ON(is_huge_zero_page(page));
> +	if (is_huge_zero_page(page))
> +		goto huge_zero_page;
> +
>  	page_nid = page_to_nid(page);
>  	last_cpupid = page_cpupid_last(page);
>  	count_vm_numa_event(NUMA_HINT_FAULTS);
> @@ -1381,6 +1383,11 @@ out:
>  		task_numa_fault(last_cpupid, page_nid, HPAGE_PMD_NR, flags);
>  
>  	return 0;
> +huge_zero_page:
> +	pmd = pmd_modify(pmd, vma->vm_page_prot);
> +	set_pmd_at(mm, haddr, pmdp, pmd);
> +	update_mmu_cache_pmd(vma, addr, pmdp);
> +	goto out_unlock;

I don't see what prevents the code to make zero page writable here.
We need at least pmd = pmd_wrprotect(pmd) before set_pmd_at();

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/