Date:	Mon, 28 Jul 2014 10:01:57 -0400
From:	Johannes Weiner <hannes@...xchg.org>
To:	Hugh Dickins <hughd@...gle.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Rientjes <rientjes@...gle.com>,
	Rik van Riel <riel@...hat.com>,
	Dave Jones <davej@...hat.com>,
	Dave Chinner <david@...morbit.com>, xfs@....sgi.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] mm: fix direct reclaim writeback regression

On Sat, Jul 26, 2014 at 12:58:23PM -0700, Hugh Dickins wrote:
> Shortly before 3.16-rc1, Dave Jones reported:
> 
> WARNING: CPU: 3 PID: 19721 at fs/xfs/xfs_aops.c:971
>          xfs_vm_writepage+0x5ce/0x630 [xfs]()
> CPU: 3 PID: 19721 Comm: trinity-c61 Not tainted 3.15.0+ #3
> Call Trace:
>  [<ffffffffc023068e>] xfs_vm_writepage+0x5ce/0x630 [xfs]
>  [<ffffffff8316f759>] shrink_page_list+0x8f9/0xb90
>  [<ffffffff83170123>] shrink_inactive_list+0x253/0x510
>  [<ffffffff83170c93>] shrink_lruvec+0x563/0x6c0
>  [<ffffffff83170e2b>] shrink_zone+0x3b/0x100
>  [<ffffffff831710e1>] shrink_zones+0x1f1/0x3c0
>  [<ffffffff83171414>] try_to_free_pages+0x164/0x380
>  [<ffffffff83163e52>] __alloc_pages_nodemask+0x822/0xc90
>  [<ffffffff831abeff>] alloc_pages_vma+0xaf/0x1c0
>  [<ffffffff8318a931>] handle_mm_fault+0xa31/0xc50
> etc.
> 
>  970   if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
>  971                   PF_MEMALLOC))
> 
> I did not respond at the time, because a glance at the PageDirty block
> in shrink_page_list() quickly shows that this is impossible: we don't do
> writeback on file pages (other than tmpfs) from direct reclaim nowadays.
> Dave was hallucinating, but it would have been disrespectful to say so.
> 
> However, my own /var/log/messages now shows similar complaints
> WARNING: CPU: 1 PID: 28814 at fs/ext4/inode.c:1881 ext4_writepage+0xa7/0x38b()
> WARNING: CPU: 0 PID: 27347 at fs/ext4/inode.c:1764 ext4_writepage+0xa7/0x38b()
> from stressing some mmotm trees during July.
> 
> Could a dirty xfs or ext4 file page somehow get marked PageSwapBacked,
> so fail shrink_page_list()'s page_is_file_cache() test, and so proceed
> to mapping->a_ops->writepage()?
> 
> Yes, 3.16-rc1's 68711a746345 ("mm, migration: add destination page
> freeing callback") has provided such a way to compaction: if migrating
> a SwapBacked page fails, its newpage may be put back on the list for
> later use with PageSwapBacked still set, and nothing will clear it.
>
> Whether that can do anything worse than issue WARN_ON_ONCEs, and get
> some statistics wrong, is unclear: easier to fix than to think through
> the consequences.
> 
> Fixing it here, before the put_new_page(), addresses the bug directly,
> but is probably the worst place to fix it.  Page migration is doing too
> many parts of the job on too many levels: fixing it in move_to_new_page()
> to complement its SetPageSwapBacked would be preferable, except why is it
> (and newpage->mapping and newpage->index) done there, rather than down in
> migrate_page_move_mapping(), once we are sure of success?  Not a cleanup
> to get into right now, especially not with memcg cleanups coming in 3.17.
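
The mechanism described in the quoted text can be modelled in a few lines of
userspace C. This is only a sketch under stated assumptions, not kernel code:
struct reclaim_page, the model_*() helpers and the flag bit values are
hypothetical stand-ins, and the dirty-page handling in shrink_page_list() is
reduced to the single decision that matters here. It shows why a dirty
filesystem page that wrongly carries PageSwapBacked gets past the file-cache
test and reaches ->writepage() from direct reclaim, which is what the
WARN_ON_ONCE quoted above catches.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this model only; not the kernel's definitions. */
#define PF_MEMALLOC 0x1u
#define PF_KSWAPD   0x2u

/* Hypothetical stand-in for the two struct page flags that matter here. */
struct reclaim_page {
	bool swap_backed;	/* models PageSwapBacked */
	bool dirty;		/* models PageDirty */
};

/* Models page_is_file_cache(): a page counts as file cache if not swap backed. */
static bool model_is_file_cache(const struct reclaim_page *p)
{
	return !p->swap_backed;
}

/* Models the filesystem's test: PF_MEMALLOC set and PF_KSWAPD clear. */
static bool model_is_direct_reclaim(unsigned int task_flags)
{
	return (task_flags & (PF_MEMALLOC | PF_KSWAPD)) == PF_MEMALLOC;
}

/*
 * Simplified model of the decision: would reclaim hand this page to
 * ->writepage() in a context where the filesystem warns?
 */
static bool model_would_warn(const struct reclaim_page *p, unsigned int task_flags)
{
	if (!p->dirty)
		return false;
	/* Dirty file cache pages are not written back unless we are kswapd. */
	if (model_is_file_cache(p) && !(task_flags & PF_KSWAPD))
		return false;
	return model_is_direct_reclaim(task_flags);
}

/* Usage: the three cases discussed in the thread. */
static void model_self_check(void)
{
	struct reclaim_page file_page = { .swap_backed = false, .dirty = true };
	struct reclaim_page stale_page = { .swap_backed = true, .dirty = true };

	/* Normal dirty file page in direct reclaim: skipped, no warning. */
	assert(!model_would_warn(&file_page, PF_MEMALLOC));
	/* Same page with a stale SwapBacked flag: written back, warning fires. */
	assert(model_would_warn(&stale_page, PF_MEMALLOC));
	/* From kswapd the filesystem check does not trigger. */
	assert(!model_would_warn(&stale_page, PF_MEMALLOC | PF_KSWAPD));
}
```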

That would first need verification that no ->migratepage() implementation
expects mapping (and thus a working PageAnon()) and index to be set up on
newpage already.

The freelist putback looks quite fragile; we should probably add
something like free_pages_prepare() / free_page_check() in there.
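
As a rough illustration of what such a check on the putback path could look
like, here is a minimal userspace sketch. Everything in it is an assumption
made for the example: struct free_page, model_page_is_clean() and
model_putback() are hypothetical names, not the kernel's data structures or
helpers. The idea is only that a page returning to a private freelist is
inspected for leftover state, the leftover state is scrubbed, and the caller
is told whether anything stale was found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page state that must not leak. */
struct free_page {
	bool swap_backed;		/* models PageSwapBacked */
	void *mapping;			/* models page->mapping */
	unsigned long index;		/* models page->index */
	struct free_page *next;		/* freelist linkage for this model */
};

/* A page is fit for the freelist only if it carries no leftover state. */
static bool model_page_is_clean(const struct free_page *p)
{
	return !p->swap_backed && p->mapping == NULL && p->index == 0;
}

/*
 * Put a page back on the freelist. Returns true if the page was already
 * clean, false if stale state had to be scrubbed first (the case a real
 * check would want to report loudly).
 */
static bool model_putback(struct free_page **freelist, struct free_page *p)
{
	bool was_clean;

	assert(freelist != NULL && p != NULL);

	was_clean = model_page_is_clean(p);
	if (!was_clean) {
		/* Scrub so the next user never sees the old flags. */
		p->swap_backed = false;
		p->mapping = NULL;
		p->index = 0;
	}
	p->next = *freelist;
	*freelist = p;
	return was_clean;
}
```

With a check of this kind, the stale PageSwapBacked case from this thread
would be reported at putback time instead of surfacing later as a warning in
a filesystem's ->writepage().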

> Reported-by: Dave Jones <davej@...hat.com>
> Signed-off-by: Hugh Dickins <hughd@...gle.com>

Acked-by: Johannes Weiner <hannes@...xchg.org>
