Message-Id: <20150113144753.ea2658cdf1a78e1b8cbdb576@linux-foundation.org>
Date: Tue, 13 Jan 2015 14:47:53 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Matthew Wilcox <willy@...ux.intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@...el.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH v12 08/20] dax,ext2: Replace the XIP page fault handler
with the DAX page fault handler
On Tue, 13 Jan 2015 16:53:34 -0500 Matthew Wilcox <willy@...ux.intel.com> wrote:
> /*
> * Lock ordering in mm:
> *
> * inode->i_mutex (while writing or truncating, not reading or faulting)
> * mm->mmap_sem
>
> > > In the worst case, the file still has blocks
> > > + * allocated past the end of the file.
> > > + */
> > > + size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > > + if (unlikely(vmf->pgoff >= size)) {
> > > + error = -EIO;
> > > + goto out;
> > > + }
> >
> > How does this play with holepunching? Checking i_size won't work there?
>
> It doesn't. But the same problem exists with non-DAX files too, and
> when I pointed it out, it was met with a shrug from the crowd. I saw a
> patch series just recently that fixes it for XFS, but as far as I know,
> btrfs and ext4 still don't play well with pagefault vs hole-punch races.
What are the user-visible effects of the race?
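For reference, the arithmetic in the quoted check can be sketched in userspace C. This is only an illustration under stated assumptions: the sketch_ names and the fixed 4096-byte page are made up here, and none of this is the kernel code itself. It shows why the test only catches faults at or beyond EOF: it looks at i_size and nothing else, so a hole punched in the middle of the file leaves the result unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed for illustration: a 4096-byte page. The kernel gets these from the arch. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Number of pages needed to cover i_size bytes, rounding up. */
static uint64_t sketch_size_in_pages(uint64_t i_size)
{
	/* Guard the round-up addition against wrapping. */
	assert(i_size <= UINT64_MAX - (SKETCH_PAGE_SIZE - 1));
	return (i_size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/*
 * Mirrors the quoted test: 0 if the faulting page offset lies inside the
 * file, -EIO if it is at or past the last page. Only i_size is consulted,
 * so punched holes below EOF are invisible to this check.
 */
static int sketch_check_fault_offset(uint64_t i_size, uint64_t pgoff)
{
	if (pgoff >= sketch_size_in_pages(i_size))
		return -EIO;
	return 0;
}
```

With a 4097-byte file the size rounds up to two pages, so page offsets 0 and 1 pass and offset 2 fails with -EIO.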
> > > + memset(&bh, 0, sizeof(bh));
> > > + block = (sector_t)vmf->pgoff << (PAGE_SHIFT - blkbits);
> > > + bh.b_size = PAGE_SIZE;
> >
> > ah, there.
> >
> > PAGE_SIZE varies a lot between architectures. What are the
> > implications of this?
>
> At the moment, you can only do DAX for blocksizes that are equal to
> PAGE_SIZE. That's a restriction that existed for the previous XIP code,
> and I haven't fixed it all for DAX yet. I'd like to, but it's not high on
> my list of things to fix. Since these are in-memory filesystems, there's
> not likely to be high demand to move the filesystem between machines.
hm, I guess not.
This means that our users will need to mkfs their filesystems with
blocksize==pagesize. The "error: unsupported blocksize for dax" printk
should get the message across, but a mention in
Documentation/filesystems/dax.txt's "Shortcomings" section wouldn't
hurt.
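
To make the blocksize==pagesize restriction concrete, here is a minimal userspace sketch of the two pieces involved: the equality test a mount would have to pass, and the pgoff-to-block shift from the quoted hunk. The sketch_ function names and the explicit page_shift parameter are assumptions for illustration, not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when the filesystem block size (1 << blkbits) equals the page size
 * (1 << page_shift), which is what the thread says DAX requires for now.
 */
static bool sketch_dax_blocksize_ok(unsigned int blkbits, unsigned int page_shift)
{
	return blkbits == page_shift;
}

/*
 * Mirrors "block = (sector_t)vmf->pgoff << (PAGE_SHIFT - blkbits)":
 * converts a page offset within the file to a filesystem block number.
 */
static uint64_t sketch_pgoff_to_block(uint64_t pgoff, unsigned int blkbits,
				      unsigned int page_shift)
{
	assert(blkbits <= page_shift);	/* blocks are never larger than a page here */
	return pgoff << (page_shift - blkbits);
}
```

When the two sizes match, the shift count is zero and the block number equals the page offset. A filesystem made with 4096-byte blocks (blkbits 12) would fail the first test on a kernel configured with 64KB pages (page_shift 16), which is the case the printk and the dax.txt note are meant to cover.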