Message-ID: <20100615163747.GK28052@random.random>
Date:	Tue, 15 Jun 2010 18:37:47 +0200
From:	Andrea Arcangeli <aarcange@...hat.com>
To:	Mel Gorman <mel@....ul.ie>
Cc:	Christoph Hellwig <hch@...radead.org>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, Dave Chinner <david@...morbit.com>,
	Chris Mason <chris.mason@...cle.com>,
	Nick Piggin <npiggin@...e.de>, Rik van Riel <riel@...hat.com>
Subject: Re: [RFC PATCH 0/6] Do not call ->writepage[s] from direct reclaim
 and use a_ops->writepages() where possible

On Tue, Jun 15, 2010 at 05:30:44PM +0100, Mel Gorman wrote:
> See this
> 
> STATIC int
> xfs_vm_writepage(
>         struct page             *page,
>         struct writeback_control *wbc)
> {
>         int                     error;
>         int                     need_trans;
>         int                     delalloc, unmapped, unwritten;
>         struct inode            *inode = page->mapping->host;
> 
>         trace_xfs_writepage(inode, page, 0);
> 
>         /*
>          * Refuse to write the page out if we are called from reclaim
>          * context.
>          *
>          * This is primarily to avoid stack overflows when called from deep
>          * used stacks in random callers for direct reclaim, but disabling
>          * reclaim for kswap is a nice side-effect as kswapd causes rather
>          * suboptimal I/O patters, too.
>          *
>          * This should really be done by the core VM, but until that happens
>          * filesystems like XFS, btrfs and ext4 have to take care of this
>          * by themselves.
>          */
>         if (current->flags & PF_MEMALLOC)
>                 goto out_fail;

So it's under xfs/linux-2.6... ;) I guess this dates back to the
xfs/irix and xfs/freebsd days, no problem.

> Again, missing the code to do it and am missing data showing that not
> writing pages in direct reclaim is really a bad idea.

Your code is functionally fine; my point is that it's not just
writepage, as shown by the PF_MEMALLOC check in ext4.
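
To make the guard under discussion concrete, here is a minimal user-space
sketch of it. This is only a model under stated assumptions, not kernel
code: `model_task`, `model_page`, `MODEL_PF_MEMALLOC` and `model_writepage`
are hypothetical stand-ins invented for illustration, and the real
xfs_vm_writepage does far more than flip a dirty bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PF_MEMALLOC task flag. */
#define MODEL_PF_MEMALLOC 0x1u

/* Hypothetical stand-in for the part of the task state that matters here. */
struct model_task {
	unsigned int flags;
};

/* Hypothetical stand-in for a page: only the dirty bit is modelled. */
struct model_page {
	bool dirty;
};

/*
 * Model of a ->writepage style entry point.
 *
 * Returns 0 if the page was "written" (dirty bit cleared), or 1 if the
 * write was refused because the caller is in reclaim context, in which
 * case the page is left dirty for some other context to write later.
 */
int model_writepage(const struct model_task *cur, struct model_page *page)
{
	assert(cur != NULL && page != NULL);

	/* Same shape as the quoted check: bail out early under reclaim. */
	if (cur->flags & MODEL_PF_MEMALLOC)
		return 1;

	/* Pretend the I/O was submitted and has completed. */
	page->dirty = false;
	return 0;
}
```

Calling `model_writepage()` with a task whose flags include
`MODEL_PF_MEMALLOC` leaves the page dirty and returns 1; with the flag
clear, the page is cleaned and 0 is returned. The same early-return shape
can sit in any write path, which is why checking only writepage is not
enough.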

> Other than the whole "lacking the code" thing, it's still not clear that
> writing from direct reclaim is absolutely necessary for VM stability,
> considering it's been ignored today by at least two filesystems. I can
> add the throttling logic if it'd make you happier, but I know it'd be at
> least two weeks before I could start from scratch on a
> stack-switch-based solution, and a PITA considering that I'm not
> convinced it's necessary :)

The reason things are working, I think, is wait_on_page_writeback. By
the time lots of RAM is full of dirty pages, pdflush and friends will
submit the I/O, and the VM will still wait for that I/O to complete.
Waiting eats no stack; submitting I/O does. So that explains why
everything works fine.

It'd be interesting to verify that things don't fall apart with
current xfs if you swapon ./file_on_xfs instead of /dev/something.