Message-ID: <983bb802-d883-18d4-7945-dbfa209c1cc8@linaro.org>
Date: Fri, 20 May 2022 07:50:20 -0700
From: Tadeusz Struk <tadeusz.struk@...aro.org>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, lkml <linux-kernel@...r.kernel.org>,
linux-fsdevel@...r.kernel.org
Subject: Re: kernel BUG in ext4_writepages
On 5/20/22 02:50, Jan Kara wrote:
> On Thu 19-05-22 16:14:17, Tadeusz Struk wrote:
>> On 5/19/22 05:23, Jan Kara wrote:
>>> Hi!
>>>
>>> On Tue 10-05-22 15:28:38, Tadeusz Struk wrote:
>>>> Syzbot found another BUG in ext4_writepages [1].
>>>> This time it complains about inode with inline data.
>>>> C reproducer can be found here [2]
>>>> I was able to trigger it on 5.18.0-rc6
>>>>
>>>> [1] https://syzkaller.appspot.com/bug?id=a1e89d09bbbcbd5c4cb45db230ee28c822953984
>>>> [2] https://syzkaller.appspot.com/text?tag=ReproC&x=129da6caf00000
>>>
>>> Thanks for report. This should be fixed by:
>>>
>>> https://lore.kernel.org/all/20220516012752.17241-1-yebin10@huawei.com/
>>
>>
>> In case of the syzbot bug there is something messed up with PAGE DIRTY flags
>> and the way syzbot sets up the write. This is what triggers the crash:
>
> Can you tell me where exactly we hit the bug? I've now noticed that this is
> on 5.10 kernel and on vanilla 5.10 there's no BUG_ON on line 2753.
We are hitting this bug:
https://elixir.bootlin.com/linux/latest/source/fs/ext4/inode.c#L2707
Syzbot found it in v5.10, but I recreated it on 5.18-rc7, which is why
the line numbers don't match. It is the same bug, though.
On 5.10 it's at line 2739:
https://elixir.bootlin.com/linux/v5.10.117/source/fs/ext4/inode.c#L2739
>
>> $ ftrace -f ./repro
>> ...
>> [pid 2395] open("./bus", O_RDWR|O_CREAT|O_SYNC|O_NOATIME, 000 <unfinished ...>
>> [pid 2395] <... open resumed> ) = 6
>> ...
>> [pid 2395] write(6, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 22 <unfinished ...>
>> ...
>> [pid 2395] <... write resumed> ) = 22
>>
>> One way I could fix it was to clear the PAGECACHE_TAG_DIRTY on the mapping in
>> ext4_try_to_write_inline_data() after the page has been updated:
>>
>> diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
>> index 9c076262770d..e4bbb53fa26f 100644
>> --- a/fs/ext4/inline.c
>> +++ b/fs/ext4/inline.c
>> @@ -715,6 +715,7 @@ int ext4_try_to_write_inline_data(struct address_space *mapping,
>> put_page(page);
>> goto out_up_read;
>> }
>> + __xa_clear_mark(&mapping->i_pages, 0, PAGECACHE_TAG_DIRTY);
>> }
>> ret = 1;
>>
>> Please let me know if it makes sense and I will send a proper patch.
>
> No, this looks really wrong... We need to better understand what's going
> on.
That's what I was afraid of. I'm trying to divert ext4_writepages() to the
out_writepages path before we hit this BUG_ON().
Any hints will be much appreciated.
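
For reference, the two syscalls visible in the trace above boil down to
roughly the following. This is only a minimal sketch of the open/write
pair and not the syzbot reproducer, which does more setup; the helper
name small_sync_write() is made up for illustration. My assumption is
that the 22-byte write is small enough to land in inline data, and that
O_SYNC is what pushes it into writeback right away.

```c
#define _GNU_SOURCE            /* for O_NOATIME on glibc */
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: replays only the open() and write() calls seen
 * in the trace. Returns the number of bytes written, or -1 on error.
 */
ssize_t small_sync_write(const char *path)
{
	char buf[22];
	ssize_t n;
	int fd;

	/* The trace shows 22 zero bytes being written. */
	memset(buf, 0, sizeof(buf));

	/* Same flags and mode (000) as in the trace. */
	fd = open(path, O_RDWR | O_CREAT | O_SYNC | O_NOATIME, 0);
	if (fd < 0)
		return -1;

	/* With O_SYNC the write returns only after the data is flushed. */
	n = write(fd, buf, sizeof(buf));
	close(fd);
	return n;
}
```

Calling small_sync_write("./bus") on an ext4 mount with inline_data
enabled should exercise the same open/write pair, though by itself it is
not expected to trigger the BUG_ON().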
--
Thanks,
Tadeusz