[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150927195655.18e20003@tlielax.poochiereds.net>
Date: Sun, 27 Sep 2015 19:56:55 -0400
From: Jeff Layton <jlayton@...chiereds.net>
To: Dave Chinner <david@...morbit.com>
Cc: Hugh Dickins <hughd@...gle.com>, angelo <angelo70@...il.com>,
Al Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Christoph Hellwig <hch@...radead.org>,
Andi Kleen <andi@...stfloor.org>, Eryu Guan <eguan@...hat.com>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH] mm: fix cpu hangs on truncating last page of a 16t
sparse file
On Mon, 28 Sep 2015 09:26:45 +1000
Dave Chinner <david@...morbit.com> wrote:
> On Sun, Sep 27, 2015 at 10:59:33AM -0700, Hugh Dickins wrote:
> > On Sun, 27 Sep 2015, angelo wrote:
> > > On 27/09/2015 03:36, Hugh Dickins wrote:
> > > > Let's Cc linux-fsdevel, who will be more knowledgable.
> > > >
> > > > On Sun, 27 Sep 2015, angelo wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > running xfstests, generic 308 on whatever 32bit arch is possible
> > > > > to observe cpu to hang near 100% on unlink.
> >
> > I have since tried to repeat your result, but generic/308 on 32-bit just
> > skipped the test for me. I didn't investigate why: it's quite possible
> > that I had a leftover 64-bit executable in the path that it tried to use,
> > but didn't show the relevant error message.
> >
> > I did verify your result with a standalone test; and that proves that
> > nobody has actually been using such files in practice before you,
> > since unmounting the xfs filesystem would hang in the same way if
> > they didn't unlink them.
>
> It used to work - this is a regression. Just because nobody has
> reported it recently simply means nobody has run xfstests on 32 bit
> storage recently. There are 32 bit systems out there that expect
> this to work, and we've broken it.
>
> The regression was introduced in 3.11 by this commit:
>
> commit 5a7203947a1d9b6f3a00a39fda08c2466489555f
> Author: Lukas Czerner <lczerner@...hat.com>
> Date: Mon May 27 23:32:35 2013 -0400
>
> mm: teach truncate_inode_pages_range() to handle non page aligned ranges
>
> This commit changes truncate_inode_pages_range() so it can handle non
> page aligned regions of the truncate. Currently we can hit BUG_ON when
> the end of the range is not page aligned, but we can handle unaligned
> start of the range.
>
> Being able to handle non page aligned regions of the page can help file
> system punch_hole implementations and save some work, because once we're
> holding the page we might as well deal with it right away.
>
> In previous commits we've changed ->invalidatepage() prototype to accept
> 'length' argument to be able to specify range to invalidate. No we can
> use that new ability in truncate_inode_pages_range().
>
> Signed-off-by: Lukas Czerner <lczerner@...hat.com>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Hugh Dickins <hughd@...gle.com>
> Signed-off-by: Theodore Ts'o <tytso@....edu>
>
>
> > > > > The test removes a sparse file of length 16tera where only the last
> > > > > 4096 bytes block is mapped.
> > > > > At line 265 of truncate.c there is a
> > > > > if (index >= end)
> > > > > break;
> > > > > But if index is, as in this case, a 4294967295, it match -1 used as
> > > > > eof. Hence the cpu loops 100% just after.
> > > > >
> > > > That's odd. I've not checked your patch, because I think the problem
> > > > would go beyond truncate, and the root cause lie elsewhere.
> > > >
> > > > My understanding is that the 32-bit
> > > > #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
> > > > makes a page->index of -1 (or any "negative") impossible to reach.
>
> We've supported > 8TB files on 32 bit XFS file systems since
> since mid 2003:
>
> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=d13d78f6b83eefbd90a6cac5c9fbe42560c6511e
>
> And it's been documented as such for a long time, too:
>
> http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch02s04.html
>
> (that was written, IIRC, back in 2007).
>
> i.e. whatever the definition says about MAX_LFS_FILESIZE being an
> 8TB limit on 32 bit is stale and has been for a very long time.
>
> > A surprise to me, and I expect to others, that 32-bit xfs is not
> > respecting MAX_LFS_FILESIZE: going its own way with 0xfff ffffffff
> > instead of 0x7ff ffffffff (on a PAGE_CACHE_SIZE 4096 system).
> >
> > MAX_LFS_FILESIZE has been defined that way ever since v2.5.4:
> > this is probably just an oversight from when xfs was later added
> > into the Linux tree.
>
> We supported >8 TB file offsets on 32 bit systems on 2.4 kernels
> with XFS, so it sounds like it was wrong even when it was first
> committed. Of course, XFS wasn't merged until 2.5.36, so I guess
> nobody realised... ;)
>
> > > But if s_maxbytes doesn't have to be greater than MAX_LFS_FILESIZE,
> > > i agree the issue should be fixed in layers above.
> >
> > There is a "filesystems should never set s_maxbytes larger than
> > MAX_LFS_FILESIZE" comment in fs/super.c, but unfortunately its
> > warning is written with just 64-bit in mind (testing for negative).
>
> Yup, introduced here:
>
> commit 42cb56ae2ab67390da34906b27bedc3f2ff1393b
> Author: Jeff Layton <jlayton@...hat.com>
> Date: Fri Sep 18 13:05:53 2009 -0700
>
> vfs: change sb->s_maxbytes to a loff_t
>
> sb->s_maxbytes is supposed to indicate the maximum size of a file that can
> exist on the filesystem. It's declared as an unsigned long long.
>
> And yes, that will never fire on a 32bit filesystem, because loff_t
> is a "long long" type....
>
Hmm...should we change that to something like this instead?
WARN(((unsigned long long)sb->s_maxbytes > (unsigned long long)MAX_LFS_FILESIZE,
"%s set sb->s_maxbytes to too large a value (0x%llx)\n", type->name, sb->s_maxbytes);
--
Jeff Layton <jlayton@...chiereds.net>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists