Message-ID: <Y6L51cR5EZ//cw8J@sashalap>
Date:   Wed, 21 Dec 2022 07:19:33 -0500
From:   Sasha Levin <sashal@...nel.org>
To:     Dave Chinner <david@...morbit.com>
Cc:     linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        Dave Chinner <dchinner@...hat.com>,
        Christoph Hellwig <hch@....de>,
        "Darrick J . Wong" <djwong@...nel.org>, hch@...radead.org,
        linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH AUTOSEL 6.1 13/16] iomap: write iomap validity checks

On Tue, Dec 20, 2022 at 03:01:12PM +1100, Dave Chinner wrote:
>On Mon, Dec 19, 2022 at 08:20:50PM -0500, Sasha Levin wrote:
>> From: Dave Chinner <dchinner@...hat.com>
>>
>> [ Upstream commit d7b64041164ca177170191d2ad775da074ab2926 ]
>>
>> A recent multithreaded write data corruption has been uncovered in
>> the iomap write code. The core of the problem is partial folio
>> writes can be flushed to disk while a new racing write can map it
>> and fill the rest of the page:
>>
>> writeback			new write
>>
>> allocate blocks
>>   blocks are unwritten
>> submit IO
>> .....
>> 				map blocks
>> 				iomap indicates UNWRITTEN range
>> 				loop {
>> 				  lock folio
>> 				  copyin data
>> .....
>> IO completes
>>   runs unwritten extent conv
>>     blocks are marked written
>> 				  <iomap now stale>
>> 				  get next folio
>> 				}
>>
>> Now add memory pressure such that memory reclaim evicts the
>> partially written folio that has already been written to disk.
>>
>> When the new write finally gets to the last partial page of the new
>> write, it does not find it in cache, so it instantiates a new page,
>> sees the iomap is unwritten, and zeros the part of the page that
>> it does not have data from. This overwrites the data on disk that
>> was originally written.
>>
>> The full description of the corruption mechanism can be found here:
>>
>> https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/
>>
>> To solve this problem, we need to check whether the iomap is still
>> valid after we lock each folio during the write. We have to do it
>> after we lock the page so that we don't end up with state changes
>> occurring while we wait for the folio to be locked.
>>
>> Hence we need a mechanism to be able to check that the cached iomap
>> is still valid (similar to what we already do in buffered
>> writeback), and we need a way for ->begin_write to back out and
>> tell the high level iomap iterator that we need to remap the
>> remaining write range.
>>
>> The iomap needs to grow some storage for the validity cookie that
>> the filesystem provides to travel with the iomap. XFS, in
>> particular, also needs to know some more information about what the
>> iomap maps (attribute extents rather than file data extents) for
>> the validity cookie to cover all the types of iomaps we might need
>> to validate.
>>
>> Signed-off-by: Dave Chinner <dchinner@...hat.com>
>> Reviewed-by: Christoph Hellwig <hch@....de>
>> Reviewed-by: Darrick J. Wong <djwong@...nel.org>
>> Signed-off-by: Sasha Levin <sashal@...nel.org>
>
>This commit is not a standalone backport candidate. It is a pure
>infrastructure change that does nothing by itself except to add more
>code that won't get executed. There are another 7-8 patches that
>need to be backported along with this patch to fix the data
>corruption that is mentioned in this commit.
>
>I'd strongly suggest that you leave this whole series of commits to
>the XFS LTS maintainers to backport if they so choose to - randomly
>backporting commits from the middle of the series only makes their
>job more complex....

Ack, I'll drop it, thanks!

-- 
Thanks,
Sasha
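
For readers following the quoted commit message, the validity-cookie idea
it describes can be sketched in userspace C roughly as follows. This is a
toy model under stated assumptions, not the kernel implementation: every
name here (toy_inode, toy_iomap, toy_write, and so on) is hypothetical,
the "cookie" is assumed to be a simple sequence number bumped on every
extent map change, and folio locking is only indicated by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names are kernel identifiers. */
struct toy_inode {
	uint64_t extent_seq;	/* assumed: bumped on every extent map change */
	bool unwritten;		/* state of the single extent being modelled */
};

struct toy_iomap {
	bool unwritten;			/* cached view of the extent state */
	uint64_t validity_cookie;	/* extent_seq sampled at mapping time */
};

/* Sample the current extent state and remember when it was sampled. */
static void toy_map_blocks(const struct toy_inode *ip, struct toy_iomap *map)
{
	map->unwritten = ip->unwritten;
	map->validity_cookie = ip->extent_seq;
}

/* Models IO completion converting the unwritten extent to written. */
static void toy_unwritten_conversion(struct toy_inode *ip)
{
	ip->unwritten = false;
	ip->extent_seq++;
}

/* The cached mapping is usable only if nothing changed since sampling. */
static bool toy_iomap_valid(const struct toy_inode *ip,
			    const struct toy_iomap *map)
{
	return map->validity_cookie == ip->extent_seq;
}

/*
 * Write nfolios folios. Writeback completion races in just before folio
 * index race_at is processed (pass -1 for no race). Returns how many
 * times the write had to back out and remap the remaining range.
 */
static int toy_write(struct toy_inode *ip, int nfolios, int race_at)
{
	struct toy_iomap map;
	int remaps = 0;

	assert(nfolios >= 0);
	toy_map_blocks(ip, &map);

	for (int i = 0; i < nfolios; i++) {
		if (i == race_at)
			toy_unwritten_conversion(ip);

		/* The folio would be locked here; validate only after that. */
		if (!toy_iomap_valid(ip, &map)) {
			toy_map_blocks(ip, &map);	/* stale: remap */
			remaps++;
		}

		/*
		 * Copy-in would happen here. map.unwritten is now current,
		 * so a stale "unwritten" view cannot cause zeroing over
		 * data that has already reached disk.
		 */
	}
	return remaps;
}
```

In this sketch, starting from an unwritten extent, toy_write(&ip, 4, 2)
detects the stale mapping once and remaps, while toy_write(&ip, 4, -1)
never remaps. The check sits after the (notional) folio lock for the
reason the commit message gives: state changes must not be able to occur
while waiting for the lock.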
