linux-kernel - Re: shrink_active_list/try_to_release_page bug? (was Re: xfs trace in 4.4.2 / also in 4.3.3 WARNING fs/xfs/xfs_aops.c:1232 xfs_vm

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160531060712.GC12670@dastard>
Date:	Tue, 31 May 2016 16:07:12 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Minchan Kim <minchan@...nel.org>
Cc:	Stefan Priebe - Profihost AG <s.priebe@...fihost.ag>,
	Brian Foster <bfoster@...hat.com>,
	"xfs@....sgi.com" <xfs@....sgi.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: shrink_active_list/try_to_release_page bug? (was Re: xfs trace
 in 4.4.2 / also in 4.3.3 WARNING fs/xfs/xfs_aops.c:1232 xfs_vm_releasepage)

On Tue, May 31, 2016 at 12:59:04PM +0900, Minchan Kim wrote:
> On Tue, May 31, 2016 at 12:55:09PM +1000, Dave Chinner wrote:
> > On Tue, May 31, 2016 at 10:07:24AM +0900, Minchan Kim wrote:
> > > On Tue, May 31, 2016 at 08:36:57AM +1000, Dave Chinner wrote:
> > > > But this is a dirty page, which means it may have delalloc or
> > > > unwritten state on it's buffers, both of which indicate that there
> > > > is dirty data in teh page that hasn't been written. XFS issues a
> > > > warning on this because neither shrink_active_list nor
> > > > try_to_release_page() check for whether the page is dirty or not.
> > > > 
> > > > Hence it seems to me that shrink_active_list() is calling
> > > > try_to_release_page() inappropriately, and XFS is just the
> > > > messenger. If you turn laptop mode on, it is likely the problem will
> > > > go away as kswapd will run with .may_writepage = false, but that
> > > > will also cause other behavioural changes relating to writeback and
> > > > memory reclaim. It might be worth trying as a workaround for now.
> > > > 
> > > > MM-folk - is this analysis correct? If so, why is
> > > > shrink_active_list() calling try_to_release_page() on dirty pages?
> > > > Is this just an oversight or is there some problem that this is
> > > > trying to work around? It seems trivial to fix to me (add a
> > > > !PageDirty check), but I don't know why the check is there in the
> > > > first place...
> > > 
> > > It seems to be latter.
> > > Below commit seems to be related.
> > > [ecdfc9787fe527, Resurrect 'try_to_free_buffers()' VM hackery.]
> > 
> > Okay, that's been there a long, long time (2007), and it covers a
> > case where the filesystem cleans pages without the VM knowing about
> > it (i.e. it marks bufferheads clean without clearing the PageDirty
> > state).
> > 
> > That does not explain the code in shrink_active_list().
> 
> Yeb, My point was the patch removed the PageDirty check in
> try_to_free_buffers.

*nod*

[...]

> And I found a culprit.
> e182d61263b7d5, [PATCH] buffer_head takedown for bighighmem machines

Heh. You have the combined historic tree sitting around for code
archeology, just like I do :)

> It introduced pagevec_strip wich calls try_to_release_page without
> PageDirty check in refill_inactive_zone which is shrink_active_list
> now.

<sigh>

It was merged 2 days before XFS was merged. Merging XFS made the
code Andrew wrote incorrect:

> Quote from
> "
>     In refill_inactive(): if the number of buffer_heads is excessive then
>     strip buffers from pages as they move onto the inactive list.  This
>     change is useful for all filesystems. [....]

Except for those that carry state necessary for writeback to be done
correctly on the dirty page bufferheads.  At the time, nobody doing
work the mm/writeback code cared about delayed allocation. So we've
carried this behaviour for 14 years without realising that it's
probably the source of all the unexplainable warnings we've got from
XFS over all that time.

I'm half tempted at this point to mostly ignore this mm/ behavour
because we are moving down the path of removing buffer heads from
XFS. That will require us to do different things in ->releasepage
and so just skipping dirty pages in the XFS code is the best thing
to do....

Cheers,

Dave.

-- 
Dave Chinner
david@...morbit.com