[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Y6L51cR5EZ//cw8J@sashalap>
Date: Wed, 21 Dec 2022 07:19:33 -0500
From: Sasha Levin <sashal@...nel.org>
To: Dave Chinner <david@...morbit.com>
Cc: linux-kernel@...r.kernel.org, stable@...r.kernel.org,
Dave Chinner <dchinner@...hat.com>,
Christoph Hellwig <hch@....de>,
"Darrick J . Wong" <djwong@...nel.org>, hch@...radead.org,
linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH AUTOSEL 6.1 13/16] iomap: write iomap validity checks
On Tue, Dec 20, 2022 at 03:01:12PM +1100, Dave Chinner wrote:
>On Mon, Dec 19, 2022 at 08:20:50PM -0500, Sasha Levin wrote:
>> From: Dave Chinner <dchinner@...hat.com>
>>
>> [ Upstream commit d7b64041164ca177170191d2ad775da074ab2926 ]
>>
>> A recent multithreaded write data corruption has been uncovered in
>> the iomap write code. The core of the problem is partial folio
>> writes can be flushed to disk while a new racing write can map it
>> and fill the rest of the page:
>>
>> writeback new write
>>
>> allocate blocks
>> blocks are unwritten
>> submit IO
>> .....
>> map blocks
>> iomap indicates UNWRITTEN range
>> loop {
>> lock folio
>> copyin data
>> .....
>> IO completes
>> runs unwritten extent conv
>> blocks are marked written
>> <iomap now stale>
>> get next folio
>> }
>>
>> Now add memory pressure such that memory reclaim evicts the
>> partially written folio that has already been written to disk.
>>
>> When the new write finally gets to the last partial page of the new
>> write, it does not find it in cache, so it instantiates a new page,
>> sees the iomap is unwritten, and zeros the part of the page that
>> it does not have data from. This overwrites the data on disk that
>> was originally written.
>>
>> The full description of the corruption mechanism can be found here:
>>
>> https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/
>>
>> To solve this problem, we need to check whether the iomap is still
>> valid after we lock each folio during the write. We have to do it
>> after we lock the page so that we don't end up with state changes
>> occurring while we wait for the folio to be locked.
>>
>> Hence we need a mechanism to be able to check that the cached iomap
>> is still valid (similar to what we already do in buffered
>> writeback), and we need a way for ->begin_write to back out and
>> tell the high level iomap iterator that we need to remap the
>> remaining write range.
>>
>> The iomap needs to grow some storage for the validity cookie that
>> the filesystem provides to travel with the iomap. XFS, in
>> particular, also needs to know some more information about what the
>> iomap maps (attribute extents rather than file data extents) to for
>> the validity cookie to cover all the types of iomaps we might need
>> to validate.
>>
>> Signed-off-by: Dave Chinner <dchinner@...hat.com>
>> Reviewed-by: Christoph Hellwig <hch@....de>
>> Reviewed-by: Darrick J. Wong <djwong@...nel.org>
>> Signed-off-by: Sasha Levin <sashal@...nel.org>
>
>This commit is not a standalone backport candidate. It is a pure
>infrastructure change that does nothing by itself except to add more
>code that won't get executed. There are another 7-8 patches that
>need to be backported along with this patch to fix the data
>corruption that is mentioned in this commit.
>
>I'd stronly suggest that you leave this whole series of commits to
>the XFS LTS maintainers to backport if they so choose to - randomly
>backporting commits from the middle of the series only makes their
>job more complex....
Ack, I'll drop it, thanks!
--
Thanks,
Sasha
Powered by blists - more mailing lists