Message-ID: <20160306233330.GA23851@node.shutemov.name>
Date: Mon, 7 Mar 2016 02:33:30 +0300
From: "Kirill A. Shutemov" <kirill@...temov.name>
To: Dave Chinner <david@...morbit.com>
Cc: Hugh Dickins <hughd@...gle.com>,
Dave Hansen <dave.hansen@...el.com>,
linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Christoph Lameter <cl@...two.org>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Jerome Marchand <jmarchan@...hat.com>,
Yang Shi <yang.shi@...aro.org>,
Sasha Levin <sasha.levin@...cle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: THP-enabled filesystem vs. FALLOC_FL_PUNCH_HOLE
On Mon, Mar 07, 2016 at 10:03:36AM +1100, Dave Chinner wrote:
> On Sun, Mar 06, 2016 at 03:30:34AM +0300, Kirill A. Shutemov wrote:
> > On Sun, Mar 06, 2016 at 09:38:11AM +1100, Dave Chinner wrote:
> > > And it's not just hole punching that has this problem. Direct IO is
> > > going to have the same issue with invalidation of the mapped ranges
> > > over the IO being done. XFS already WARNs when page cache
> > > invalidation fails with EBUSY in direct IO, because that is
> > > indicative of an application with a potential data corruption vector
> > > and there's nothing we can do in the kernel code to prevent it.
> >
> > My current understanding is that for filesystems with persistent storage,
> > in order to make THP useful at all, we would need to implement writeback
> > without splitting the huge page.
>
> Algorithmically it is no different to filesystem block size < page
> size writeback.
>
> > At the moment, I have no idea how hard it would be.
>
> THP support would effectively require us to remove PAGE_CACHE_SIZE
> assumptions from all of the filesystem and buffer code. That's a
> large chunk of work e.g. fs/buffer.c and any filesystem that uses
> bufferheads for tracking filesystem block state through the page
> cache.
I'll try to learn more about the code before the summit.
I guess it's something worth discussing in person.
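
To make the "block size < page size" comparison concrete, here is a
minimal userspace sketch of tracking dirty state per block inside one
huge page, so that writeback can issue only the dirty ranges without
splitting the page. This is not kernel code: struct hpage_state, the
hpage_* helpers and the 2 MiB / 4 KiB sizes are all hypothetical names
and values chosen for illustration, and real code would also need
locking and buffer state that this ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sizes only: one 2 MiB huge page tracked in 4 KiB blocks. */
#define HPAGE_SIZE        (2u * 1024 * 1024)
#define BLOCK_SIZE        4096u
#define BLOCKS_PER_HPAGE  (HPAGE_SIZE / BLOCK_SIZE)

/* Hypothetical per-huge-page state: one dirty bit per block. */
struct hpage_state {
	uint64_t dirty[BLOCKS_PER_HPAGE / 64];
};

/* Mark every block touched by the byte range [off, off + len) as dirty. */
static void hpage_mark_dirty(struct hpage_state *s, size_t off, size_t len)
{
	assert(len > 0 && off + len <= HPAGE_SIZE);
	size_t first = off / BLOCK_SIZE;
	size_t last = (off + len - 1) / BLOCK_SIZE;

	for (size_t b = first; b <= last; b++)
		s->dirty[b / 64] |= (uint64_t)1 << (b % 64);
}

static bool hpage_block_dirty(const struct hpage_state *s, size_t b)
{
	assert(b < BLOCKS_PER_HPAGE);
	return (s->dirty[b / 64] >> (b % 64)) & 1;
}

/* Callback standing in for "submit I/O for this byte range". */
typedef void (*hpage_write_fn)(size_t off, size_t len, void *ctx);

/*
 * Call fn once per contiguous run of dirty blocks, then clear all
 * dirty bits. Returns the number of runs found. fn may be NULL.
 */
static size_t hpage_writeback(struct hpage_state *s, hpage_write_fn fn,
			      void *ctx)
{
	size_t runs = 0;
	size_t b = 0;

	while (b < BLOCKS_PER_HPAGE) {
		if (!hpage_block_dirty(s, b)) {
			b++;
			continue;
		}
		size_t start = b;
		while (b < BLOCKS_PER_HPAGE && hpage_block_dirty(s, b))
			b++;
		if (fn)
			fn(start * BLOCK_SIZE, (b - start) * BLOCK_SIZE, ctx);
		runs++;
	}
	memset(s->dirty, 0, sizeof(s->dirty));
	return runs;
}
```

The point of the sketch is only that the huge page stays whole while
the dirty tracking and the I/O happen at block granularity, which is
the same shape as the existing block size < page size case.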
--
Kirill A. Shutemov