Message-ID: <5410A118.9080803@oracle.com>
Date:	Wed, 10 Sep 2014 15:06:00 -0400
From:	Sasha Levin <sasha.levin@...cle.com>
To:	Mel Gorman <mgorman@...e.de>, Hugh Dickins <hughd@...gle.com>
CC:	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Jones <davej@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Cyrill Gorcunov <gorcunov@...il.com>
Subject: Re: mm: BUG in unmap_page_range

On 09/10/2014 08:47 AM, Mel Gorman wrote:
> migrate: debug patch to try identify race between migration completion and mprotect
> 
> A migration entry is marked as write if pte_write was true at the
> time the entry was created. The VMA protections are not double checked
> when migration entries are being removed but mprotect itself will mark
> write-migration-entries as read to avoid problems. It means we potentially
> take a spurious fault to mark these ptes write again but otherwise it's
> harmless.  Still, one dump indicates that this situation can actually
> happen so this debugging patch spits out a warning if the situation occurs
> and hopefully the resulting warning will contain a clue as to how exactly
> it happens
> 
> Not-signed-off
> ---
>  mm/migrate.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 09d489c..631725c 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -146,8 +146,16 @@ static int remove_migration_pte(struct page *new, struct vm_area_struct *vma,
>  	pte = pte_mkold(mk_pte(new, vma->vm_page_prot));
>  	if (pte_swp_soft_dirty(*ptep))
>  		pte = pte_mksoft_dirty(pte);
> -	if (is_write_migration_entry(entry))
> -		pte = pte_mkwrite(pte);
> +	if (is_write_migration_entry(entry)) {
> +		/*
> +		 * This WARN_ON_ONCE is temporary for the purposes of seeing if
> +		 * it's a case encountered by trinity in Sasha's testing
> +		 */
> +		if (!(vma->vm_flags & (VM_WRITE)))
> +			WARN_ON_ONCE(1);
> +		else
> +			pte = pte_mkwrite(pte);
> +	}
>  #ifdef CONFIG_HUGETLB_PAGE
>  	if (PageHuge(new)) {
>  		pte = pte_mkhuge(pte);

I seem to have hit this warning:

[ 4782.617806] WARNING: CPU: 10 PID: 21180 at mm/migrate.c:155 remove_migration_pte+0x3f7/0x420()
[ 4782.619315] Modules linked in:
[ 4782.622189]
[ 4782.622501] CPU: 10 PID: 21180 Comm: trinity-main Tainted: G        W      3.17.0-rc4-next-20140910-sasha-00032-g6825fb5-dirty #1137
[ 4782.624344]  0000000000000009 ffff8800193eb770 ffffffffa04c742a 0000000000000000
[ 4782.627801]  ffff8800193eb7a8 ffffffff9d16e55d 00007f2458d89000 ffff880120959600
[ 4782.629283]  ffff88012b02c000 ffffea002abeab00 ffff88063118da90 ffff8800193eb7b8
[ 4782.631353] Call Trace:
[ 4782.633789]  [<ffffffffa04c742a>] dump_stack+0x4e/0x7a
[ 4782.634314]  [<ffffffff9d16e55d>] warn_slowpath_common+0x7d/0xa0
[ 4782.634877]  [<ffffffff9d16e63a>] warn_slowpath_null+0x1a/0x20
[ 4782.635430]  [<ffffffff9d315487>] remove_migration_pte+0x3f7/0x420
[ 4782.636042]  [<ffffffff9d2e99cf>] rmap_walk+0xef/0x380
[ 4782.636544]  [<ffffffff9d3147f1>] remove_migration_ptes+0x41/0x50
[ 4782.637130]  [<ffffffff9d315090>] ? __migration_entry_wait.isra.24+0x160/0x160
[ 4782.639928]  [<ffffffff9d3154b0>] ? remove_migration_pte+0x420/0x420
[ 4782.640616]  [<ffffffff9d31671b>] move_to_new_page+0x16b/0x230
[ 4782.641251]  [<ffffffff9d2e9e8c>] ? try_to_unmap+0x6c/0xf0
[ 4782.643950]  [<ffffffff9d2e88a0>] ? try_to_unmap_nonlinear+0x5c0/0x5c0
[ 4782.644690]  [<ffffffff9d2e70a0>] ? invalid_migration_vma+0x30/0x30
[ 4782.645273]  [<ffffffff9d2e82e0>] ? page_remove_rmap+0x320/0x320
[ 4782.646072]  [<ffffffff9d31717c>] migrate_pages+0x85c/0x930
[ 4782.646701]  [<ffffffff9d2d0e20>] ? isolate_freepages_block+0x410/0x410
[ 4782.647407]  [<ffffffff9d2cfa60>] ? arch_local_save_flags+0x30/0x30
[ 4782.648114]  [<ffffffff9d2d1803>] compact_zone+0x4d3/0x8a0
[ 4782.650157]  [<ffffffff9d2d1c2f>] compact_zone_order+0x5f/0xa0
[ 4782.651014]  [<ffffffff9d2d1f87>] try_to_compact_pages+0x127/0x2f0
[ 4782.651656]  [<ffffffff9d2b0c98>] __alloc_pages_direct_compact+0x68/0x200
[ 4782.652313]  [<ffffffff9d2b17ca>] __alloc_pages_nodemask+0x99a/0xd90
[ 4782.652916]  [<ffffffff9d300a1c>] alloc_pages_vma+0x13c/0x270
[ 4782.653618]  [<ffffffff9d31d914>] ? do_huge_pmd_wp_page+0x494/0xc90
[ 4782.654487]  [<ffffffff9d31d914>] do_huge_pmd_wp_page+0x494/0xc90
[ 4782.656045]  [<ffffffff9d320d20>] ? __mem_cgroup_count_vm_event+0xd0/0x240
[ 4782.657089]  [<ffffffff9d2dcb7d>] handle_mm_fault+0x8bd/0xc50
[ 4782.660931]  [<ffffffff9d1d26e6>] ? __lock_is_held+0x56/0x80
[ 4782.662695]  [<ffffffff9d0c7bc7>] __do_page_fault+0x1b7/0x660
[ 4782.663259]  [<ffffffff9d1cdc5e>] ? put_lock_stats.isra.13+0xe/0x30
[ 4782.663851]  [<ffffffff9d1abf41>] ? vtime_account_user+0x91/0xa0
[ 4782.664419]  [<ffffffff9d2a2c35>] ? context_tracking_user_exit+0xb5/0x1b0
[ 4782.665119]  [<ffffffff9db6e103>] ? __this_cpu_preempt_check+0x13/0x20
[ 4782.665969]  [<ffffffff9d1ce2e2>] ? trace_hardirqs_off_caller+0xe2/0x1b0
[ 4782.666634]  [<ffffffff9d0c8141>] trace_do_page_fault+0x51/0x2b0
[ 4782.667257]  [<ffffffff9d0bee83>] do_async_page_fault+0x63/0xd0
[ 4782.667871]  [<ffffffffa0511018>] async_page_fault+0x28/0x30

However, the warning wasn't followed by anything else, and I've seen the
original issue trigger without this WARN showing up, so it seems like a
different, unrelated issue?
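
For anyone following along, the branch the debug patch adds to
remove_migration_pte() can be modeled in userspace roughly as below. This is
only a sketch under stated assumptions: every model_* name and MODEL_VM_WRITE
are hypothetical stand-ins rather than the kernel's definitions, and the model
ignores locking, the soft-dirty bit and the hugetlb path entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's VM_WRITE flag (value is illustrative). */
#define MODEL_VM_WRITE 0x2UL

/* Hypothetical, heavily reduced versions of the VMA and pte. */
struct model_vma { unsigned long vm_flags; };
struct model_pte { bool writable; };

/* Result of restoring a pte from a migration entry in this model. */
struct model_result {
	struct model_pte pte;
	bool warned;	/* true where the debug patch would hit WARN_ON_ONCE(1) */
};

/*
 * Models the patched branch: a write migration entry only yields a
 * writable pte if the VMA still allows writes; otherwise the pte stays
 * read-only and the warning is recorded.
 */
static struct model_result model_restore_pte(const struct model_vma *vma,
					     bool write_entry)
{
	struct model_result r = { .pte = { .writable = false }, .warned = false };

	if (write_entry) {
		if (!(vma->vm_flags & MODEL_VM_WRITE))
			r.warned = true;		/* WARN_ON_ONCE(1) in the patch */
		else
			r.pte.writable = true;		/* pte_mkwrite(pte) in the patch */
	}
	return r;
}
```

In this model the warning corresponds to a write migration entry being
removed in a VMA that no longer has write permission, which is the case the
trace above reports.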


Thanks,
Sasha