[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aC5LA4bExl8rMRv0@quatroqueijos.cascardo.eti.br>
Date: Wed, 21 May 2025 18:52:03 -0300
From: Thadeu Lima de Souza Cascardo <cascardo@...lia.com>
To: Theodore Ts'o <tytso@....edu>
Cc: Jan Kara <jack@...e.com>, Tao Ma <boyu.mt@...bao.com>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Eric Biggers <ebiggers@...gle.com>, linux-ext4@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel-dev@...lia.com,
syzbot+0c89d865531d053abb2d@...kaller.appspotmail.com,
stable@...r.kernel.org
Subject: Re: [PATCH] ext4: inline: do not convert when writing to memory map
On Tue, May 20, 2025 at 10:57:08AM -0400, Theodore Ts'o wrote:
> On Mon, May 19, 2025 at 07:42:46AM -0300, Thadeu Lima de Souza Cascardo wrote:
> > inline data handling has a race between writing and writing to a memory
> > map.
> >
> > When ext4_page_mkwrite is called, it calls ext4_convert_inline_data, which
> > destroys the inline data, but if block allocation fails, restores the
> > inline data. In that process, we could have:
> >
> > CPU1 CPU2
> > destroy_inline_data
> > write_begin (does not see inline data)
> > restory_inline_data
> > write_end (sees inline data)
> >
> > The conversion inside ext4_page_mkwrite was introduced at commit
> > 7b4cc9787fe3 ("ext4: evict inline data when writing to memory map"). This
> > fixes a documented bug in the commit message, which suggests some
> > alternatives fixes.
>
> Your fix just reverts commit 7b4cc9787fe3, and removes the BUG_ON.
> While this is great for shutting up the syzbot report, but it causes
> file writes to an inline data file via a mmap to never get written
> back to the storage device. So you are replacing BUG_ON that can get
> triggered on a race condition in case of a failed block allocation,
> with silent data corruption. This is not an improvement.
>
> Thanks for trying to address this, but I'm not going to accept your
> proposed fix.
>
> - Ted
Hi, Ted.
I am trying to understand better the circumstances where the data loss
might occur with the fix, but might not occur without the fix. Or, even if
they occur either way, such that I can work on a better/proper fix.
Right now, if ext4_convert_inline_data (called from ext4_page_mkwrite)
fails with ENOSPC, the memory access will lead to a SIGBUS. The same will
happen without the fix, if there are no blocks available.
Now, without ext4_convert_inline_data, blocks will be allocated by
ext4_page_mkwrite and written by ext4_do_writepages. Are you concerned
about a failure between the clearing of the inode data and the writing of
the block in ext4_do_writepages?
Or are you concerned about a potential race condition when allocating
blocks?
Which of these cannot happen today with the code as is? If I understand
correctly, the inline conversion code also calls ext4_destroy_inline_data
before allocating and writing to blocks.
Thanks a lot for the review and guidance.
Cascardo.
Powered by blists - more mailing lists