lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180119120747.GV6584@dhcp22.suse.cz>
Date:   Fri, 19 Jan 2018 13:07:47 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     "Kirill A. Shutemov" <kirill@...temov.name>
Cc:     Dave Hansen <dave.hansen@...ux.intel.com>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        torvalds@...ux-foundation.org, kirill.shutemov@...ux.intel.com,
        akpm@...ux-foundation.org, hannes@...xchg.org,
        iamjoonsoo.kim@....com, mgorman@...hsingularity.net,
        tony.luck@...el.com, vbabka@...e.cz, aarcange@...hat.com,
        hillf.zj@...baba-inc.com, hughd@...gle.com, oleg@...hat.com,
        peterz@...radead.org, riel@...hat.com, srikar@...ux.vnet.ibm.com,
        vdavydov.dev@...il.com, mingo@...nel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.

On Fri 19-01-18 14:49:17, Kirill A. Shutemov wrote:
> On Fri, Jan 19, 2018 at 11:33:42AM +0100, Michal Hocko wrote:
> > On Fri 19-01-18 13:02:59, Kirill A. Shutemov wrote:
> > > On Thu, Jan 18, 2018 at 06:22:13PM +0100, Michal Hocko wrote:
> > > > On Thu 18-01-18 18:40:26, Kirill A. Shutemov wrote:
> > > > [...]
> > > > > +	/*
> > > > > +	 * Make sure that pages are in the same section before doing pointer
> > > > > +	 * arithmetics.
> > > > > +	 */
> > > > > +	if (page_to_section(pvmw->page) != page_to_section(page))
> > > > > +		return false;
> > > > 
> > > > OK, THPs shouldn't cross memory sections AFAIK. My brain is meltdown
> > > > these days so this might be a completely stupid question. But why don't
> > > > you simply compare pfns? This would be just simpler, no?
> > > 
> > > In original code, we already had pvmw->page around and I thought it would
> > > be easier to get page for the pte intead of looking for pfn for both
> > > sides.
> > > 
> > > We these changes it's no longer the case.
> > > 
> > > Do you care enough to send a patch? :)
> > 
> > Well, memory sections are sparsemem concept IIRC. Unless I've missed
> > something page_to_section is quarded by SECTION_IN_PAGE_FLAGS and that
> > is conditional to CONFIG_SPARSEMEM. THP is a generic code so using it
> > there is wrong unless I miss some subtle detail here.
> > 
> > Comparing pfn should be generic enough.
> 
> Good point.
> 
> What about something like this?
> 
> >From 861f68c555b87fd6c0ccc3428ace91b7e185b73a Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
> Date: Thu, 18 Jan 2018 18:24:07 +0300
> Subject: [PATCH] mm, page_vma_mapped: Drop faulty pointer arithmetics in
>  check_pte()
> 
> Tetsuo reported random crashes under memory pressure on 32-bit x86
> system and tracked down to change that introduced
> page_vma_mapped_walk().
> 
> The root cause of the issue is the faulty pointer math in check_pte().
> As ->pte may point to an arbitrary page we have to check that they are
> belong to the section before doing math. Otherwise it may lead to weird
> results.
> 
> It wasn't noticed until now as mem_map[] is virtually contiguous on flatmem or
> vmemmap sparsemem. Pointer arithmetic just works against all 'struct page'
> pointers. But with classic sparsemem, it doesn't.

it doesn't because each section memap is allocated separately and so
consecutive pfns crossing two sections might have struct pages at
completely unrelated addresses.

> Let's restructure code a bit and replace pointer arithmetic with
> operations on pfns.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Reported-by: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
> Cc: stable@...r.kernel.org
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>

The patch makes sense but there is one more thing to fix here.

[...]
>  static bool check_pte(struct page_vma_mapped_walk *pvmw)
>  {
> +	unsigned long pfn;
> +
>  	if (pvmw->flags & PVMW_MIGRATION) {
>  #ifdef CONFIG_MIGRATION
>  		swp_entry_t entry;
> @@ -41,37 +61,34 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
>  
>  		if (!is_migration_entry(entry))
>  			return false;
> -		if (migration_entry_to_page(entry) - pvmw->page >=
> -				hpage_nr_pages(pvmw->page)) {
> -			return false;
> -		}
> -		if (migration_entry_to_page(entry) < pvmw->page)
> -			return false;
> +
> +		pfn = migration_entry_to_pfn(entry);
>  #else
>  		WARN_ON_ONCE(1);
>  #endif
> -	} else {

now you allow to pass through with uninitialized pfn. We used to return
true in that case so we should probably keep it in this WARN_ON_ONCE
case. Please note that I haven't studied this particular case and the
ifdef is definitely not an act of art but that is a separate topic.

> -		if (is_swap_pte(*pvmw->pte)) {
> -			swp_entry_t entry;
> +	} else if (is_swap_pte(*pvmw->pte)) {
> +		swp_entry_t entry;
>  
> -			entry = pte_to_swp_entry(*pvmw->pte);
> -			if (is_device_private_entry(entry) &&
> -			    device_private_entry_to_page(entry) == pvmw->page)
> -				return true;
> -		}
> +		/* Handle un-addressable ZONE_DEVICE memory */
> +		entry = pte_to_swp_entry(*pvmw->pte);
> +		if (!is_device_private_entry(entry))
> +			return false;
>  
> +		pfn = device_private_entry_to_pfn(entry);
> +	} else {
>  		if (!pte_present(*pvmw->pte))
>  			return false;
>  
> -		/* THP can be referenced by any subpage */
> -		if (pte_page(*pvmw->pte) - pvmw->page >=
> -				hpage_nr_pages(pvmw->page)) {
> -			return false;
> -		}
> -		if (pte_page(*pvmw->pte) < pvmw->page)
> -			return false;
> +		pfn = pte_pfn(*pvmw->pte);
>  	}
>  
> +	if (pfn < page_to_pfn(pvmw->page))
> +		return false;
> +
> +	/* THP can be referenced by any subpage */
> +	if (pfn - page_to_pfn(pvmw->page) >= hpage_nr_pages(pvmw->page))
> +		return false;
> +
>  	return true;
>  }
>  
> -- 
>  Kirill A. Shutemov

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ