linux-ext4 - Re: [PATCH 0/1][For stable 5.4] mm: migrate: buffer_migrate_page_norefs() fallback migrate not uptodate pages

Message-ID: <2023050612-thee-chafe-569c@gregkh>
Date:   Sat, 6 May 2023 09:58:41 +0900
From:   Greg KH <greg@...ah.com>
To:     Yue Zhao <findns94@...il.com>
Cc:     stable@...r.kernel.org, linux-ext4@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        akpm@...ux-foundation.org, tytso@....edu, adilger.kernel@...ger.ca,
        jack@...e.cz, yi.zhang@...wei.com, tangyeechou@...il.com
Subject: Re: [PATCH 0/1][For stable 5.4] mm: migrate:
 buffer_migrate_page_norefs() fallback migrate not uptodate pages

On Thu, May 04, 2023 at 12:34:25AM +0800, Yue Zhao wrote:
> Recently we found that a bug related to ext4 buffer heads was fixed by
> commit 0b73284c564d ("ext4: ext4_read_bh_lock() should submit IO if the
> buffer isn't uptodate")[1].
> 
> This bug is fixed in some long-term kernel versions, such as 5.10 and 5.15.
> However, on the 5.4 stable version, we can still easily reproduce it by
> adding some delay after buffer_migrate_lock_buffers() in __buffer_migrate_page()
> and running fsstress on the ext4 filesystem. We then get errors in dmesg like:
> 
>   EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193:
>   comm fsstress: reading directory lblock 0
>   EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334:
>   comm fsstress: reading directory lblock 0
> 
> I currently have three ideas for fixing this bug in 5.4, but I don't know
> which one is best, or whether there is another feasible way to fix it
> elegantly on the 5.4 stable branch.
> 
> The first idea comes from this thread[2]. In __buffer_migrate_page(),
> we can make it fall back, like fallback_migrate_page(), when migrating
> pages that are not uptodate, because pages that have buffers will probably
> be read soon. From [3], we can see that this solution is not good enough,
> because there are other places that lock the buffer without doing IO.
> Still, I think it is a candidate fix if we do not want to change much.
> In my tests, the ext4 filesystem remained stable through a one-week stress
> test with this patch applied.
> 
> The second idea is to backport a series of commits from upstream, such as
> 
>   2d069c0889ef ("ext4: use common helpers in all places reading metadata buffers")
>   0b73284c564d ("ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate")
>   79f597842069 ("fs/buffer: remove ll_rw_block() helper")

Backporting the original upstream commits is almost always the correct
solution.  Please try doing that instead of a one-off patch like this.

thanks,

greg k-h