Message-ID: <20100615162213.GA2470@infradead.org>
Date:	Tue, 15 Jun 2010 12:22:13 -0400
From:	Christoph Hellwig <hch@...radead.org>
To:	Andrea Arcangeli <aarcange@...hat.com>
Cc:	Mel Gorman <mel@....ul.ie>, Christoph Hellwig <hch@...radead.org>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, Dave Chinner <david@...morbit.com>,
	Chris Mason <chris.mason@...cle.com>,
	Nick Piggin <npiggin@...e.de>, Rik van Riel <riel@...hat.com>
Subject: Re: [RFC PATCH 0/6] Do not call ->writepage[s] from direct reclaim
 and use a_ops->writepages() where possible

On Tue, Jun 15, 2010 at 06:14:19PM +0200, Andrea Arcangeli wrote:
> On Tue, Jun 15, 2010 at 04:38:38PM +0100, Mel Gorman wrote:
> > That is pretty much what Dave is claiming here at
> > http://lkml.org/lkml/2010/4/13/121 where if mempool_alloc_slab() needed
> 
> This stack trace shows writepage called by shrink_page_list... that
> contradict Christoph's claim that xfs already won't writepage if
> invoked by direct reclaim.

We only recently did that - before that we tried to get the VM fixed
multiple times but finally had to bite the bullet and follow ext4 and
btrfs in that regard.

> Again not what looks like from the stack trace. Also grepping for
> PF_MEMALLOC in fs/xfs shows nothing. In fact it's ext4_write_inode
> that skips the write if PF_MEMALLOC is set, not writepage apparently
> (only did a quick grep so I might be wrong). I suspect
> ext4_write_inode is the case I just mentioned about slab shrink, not
> ->writepage ;).

ext4 in fact does not check PF_MEMALLOC but simply refuses to write
out anything in ->writepage in most cases.  There is a corner case
when the page doesn't have any buffers attached where it would still
write out data, without actually calling the allocator.  I suspect
this code is actually a leftover, as we don't normally strip buffers
from a page that had them before.
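
As a rough illustration of that decision, here is a userspace model
only, not the ext4 source.  Every name below is made up for the
sketch, and "most cases" is simplified to "any page that still has
buffers attached", which is an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page; not a kernel structure. */
struct model_page {
    bool has_buffers;   /* buffer heads are still attached to the page */
};

enum model_action {
    MODEL_REDIRTY,          /* refuse to write; page stays dirty */
    MODEL_WRITE_NO_ALLOC    /* write data without calling the allocator */
};

/*
 * Model of the ->writepage behaviour described above.  The
 * caller_in_reclaim flag stands in for PF_MEMALLOC and is
 * deliberately ignored: the decision does not depend on it.
 */
static enum model_action model_writepage(const struct model_page *page,
                                         bool caller_in_reclaim)
{
    assert(page != NULL);
    (void)caller_in_reclaim;

    /* Corner case: the page lost its buffers, so write without allocating. */
    if (!page->has_buffers)
        return MODEL_WRITE_NO_ALLOC;

    /* Common case (simplified): refuse and leave the page dirty. */
    return MODEL_REDIRTY;
}
```

Calling model_writepage() with caller_in_reclaim set to true or false
gives the same answer for a given page, which is the point: the
refusal is unconditional rather than keyed off the reclaim context.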

> inodes are small, it's no big deal to keep an inode pinned and not
> slab-reclaimable because dirty, while skipping real writepage in
> memory pressure could really open a regression in oom false positives!
> One pagecache much bigger than one inode and there can be plenty more
> dirty pagecache than inodes.

At least for XFS ->write_inode is really simple these days.  If it's
a synchronous writeout, which won't happen from these paths, it logs
the inode, which is far less harmful than the whole allocator code,
and for write = 0 it only adds it to the delayed write queue, which
doesn't call into the I/O stack at all.
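
The two branches can be modelled in a few lines.  Again this is a
sketch with hypothetical names, not the XFS code, and it only captures
which path is taken, not what logging or queueing actually involves:

```c
#include <stdbool.h>

/* Hypothetical outcomes of the modelled ->write_inode call. */
enum model_inode_action {
    MODEL_LOG_INODE,      /* synchronous writeout: log the inode */
    MODEL_QUEUE_DELWRI    /* otherwise: add to the delayed write queue */
};

/* sync_writeout models the synchronous/asynchronous distinction above. */
static enum model_inode_action model_write_inode(bool sync_writeout)
{
    return sync_writeout ? MODEL_LOG_INODE : MODEL_QUEUE_DELWRI;
}
```

From the reclaim paths discussed here only the non-synchronous branch
would be taken, so the model would return MODEL_QUEUE_DELWRI.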
