Message-ID: <9984f58e-826-74c6-1cd4-65366cc01549@google.com>
Date: Tue, 22 Nov 2022 20:02:36 -0800 (PST)
From: Hugh Dickins <hughd@...gle.com>
To: hev <r@....cc>, Matthew Wilcox <willy@...radead.org>
cc: Guoqi <chenguoqic@....com>, Huacai Chen <chenhuacai@...ngson.cn>,
Rui Wang <kernel@....cc>, Hugh Dickins <hughd@...gle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH] mm/shmem: Fix undo range for failed fallocate
On Thu, 3 Nov 2022, hev wrote:
> On Wed, Nov 2, 2022 at 10:41 PM Matthew Wilcox <willy@...radead.org> wrote:
> > On Tue, Nov 01, 2022 at 11:22:48AM +0800, Rui Wang wrote:
> > > This patch fixes data loss caused by the fallocate system
> > > call interrupted by a signal.
> > >
> > > Bug: https://lore.kernel.org/linux-mm/33b85d82.7764.1842e9ab207.Coremail.chenguoqic@163.com/
> > > Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios")
> >
> > How does that commit introduce this bug?
>
> In the test case[1], we created a file that contains non-zero data
> from offset 0 to A-1, and a process tries to expand this file by
> fallocate(fd, 0, 0, B), B > A.
> Concurrently, another process tries to interrupt this fallocate syscall
> with a signal. I think the expected result is one of:
>
> 1. The file is not expanded and file size is A, and the data from
> offset 0 to A-1 is not changed.
> 2. The file is expanded and the data from offset 0 to A-1 is not
> changed, and from A to B-1 contains zeros.
>
> Now, the unexpected result is that the file is not expanded and the
> data from offset 0 to A-1 is changed by
> truncate_inode_partial_folio, called
> from shmem_undo_range with unfalloc = true.
>
> This issue reproduces only when the file is on tmpfs, and it begins with
> this commit: b9a8a4195c7d ("truncate,shmem: Handle truncates that split
> large folios")
Like Matthew, I was sceptical at first.
But I currently think that you have discovered something important, and
that your patch is the correct fix to it; but I'm still rather confused,
and want to do some more thinking and testing: this mail is mainly to
signal to Matthew that I'm on it, so he doesn't have to rush to look
at it when he's back.
I was able to reproduce it with the test case, once I multiplied both
of the usleep intervals by 10 - before that, it was too difficult for
it to complete a fallocate: guess the timing is different on my x86 box.
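For reference, the scenario quoted above can be sketched roughly as follows.
This is not the original test case[1]: it is a minimal illustration that
assumes a single process interrupting itself with a SIGALRM timer instead of
a second signalling process, and the names classify(),
interrupted_fallocate() and FILL are hypothetical. It fills a file with
non-zero bytes up to A, arms the timer, calls fallocate(fd, 0, 0, B), and
then classifies the result against the two expected outcomes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

#define FILL 0xAA	/* hypothetical non-zero fill byte */

static void on_alarm(int sig) { (void)sig; }	/* only needs to interrupt */

/*
 * 0: file not expanded, bytes 0..a-1 intact (expected outcome 1)
 * 1: file expanded to b, bytes 0..a-1 intact, a..b-1 zero (outcome 2)
 * -1: anything else, including modified data below a
 */
static int classify(int fd, off_t a, off_t b)
{
	unsigned char buf[4096];
	struct stat st;
	off_t off = 0;

	if (fstat(fd, &st) != 0 || (st.st_size != a && st.st_size != b))
		return -1;
	while (off < st.st_size) {
		ssize_t n = pread(fd, buf, sizeof(buf), off);

		if (n <= 0)
			return -1;
		for (ssize_t i = 0; i < n; i++)
			if (buf[i] != (off + i < a ? FILL : 0))
				return -1;
		off += n;
	}
	return st.st_size == b ? 1 : 0;
}

/* Create path with a bytes of FILL, then try to grow it to b while a
 * timer signal is due after delay_us microseconds (0 = no signal). */
static int interrupted_fallocate(const char *path, off_t a, off_t b,
				 long delay_us)
{
	struct itimerval arm, disarm;
	struct sigaction sa;
	unsigned char buf[4096];
	off_t done = 0;
	int fd, ret;

	assert(a >= 0 && b > a);
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	memset(buf, FILL, sizeof(buf));
	while (done < a) {
		size_t want = (a - done) < (off_t)sizeof(buf) ?
			      (size_t)(a - done) : sizeof(buf);
		ssize_t n = write(fd, buf, want);

		if (n <= 0) {
			close(fd);
			unlink(path);
			return -1;
		}
		done += n;
	}

	/* No SA_RESTART, so an interrupted syscall can return EINTR. */
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGALRM, &sa, NULL);

	memset(&arm, 0, sizeof(arm));
	arm.it_value.tv_sec = delay_us / 1000000;
	arm.it_value.tv_usec = delay_us % 1000000;
	setitimer(ITIMER_REAL, &arm, NULL);

	(void)fallocate(fd, 0, 0, b);	/* failure (e.g. EINTR) is acceptable */

	memset(&disarm, 0, sizeof(disarm));
	setitimer(ITIMER_REAL, &disarm, NULL);

	ret = classify(fd, a, b);
	close(fd);
	unlink(path);
	return ret;
}
```

A return of 0 or 1 matches one of the two expected results; -1 would
correspond to the unexpected one, provided the file was created
successfully. Hitting the window probably needs the call repeated in a loop
on a tmpfs path with varying delay_us, since the timing is evidently
machine-dependent.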
Hugh