[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL3q7H4k3Dj9gQQhBz_eadnRUaWaNPaf7+MYucshY6cis2by5Q@mail.gmail.com>
Date: Fri, 9 Jan 2026 12:43:27 +0000
From: Filipe Manana <fdmanana@...nel.org>
To: Edward Adam Davis <eadavis@...com>
Cc: syzbot+b4a2af3000eaa84d95d5@...kaller.appspotmail.com, clm@...com,
dsterba@...e.com, josef@...icpanda.com, linux-btrfs@...r.kernel.org,
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com
Subject: Re: [PATCH] btrfs: Sync read disk supper and set block size
On Fri, Jan 9, 2026 at 11:42 AM Edward Adam Davis <eadavis@...com> wrote:
>
> When the user performs a btrfs mount, the block device is not set
> correctly. The user sets the block size of the block device to 0x4000
> by executing the BLKBSZSET command.
> Since the block size change also changes the mapping->flags value, this
> further affects the result of the mapping_min_folio_order() calculation.
>
> Let's analyze the following two scenarios:
> Scenario 1: Without executing the BLKBSZSET command, the block size is
> 0x1000, and mapping_min_folio_order() returns 0;
>
> Scenario 2: After executing the BLKBSZSET command, the block size is
> 0x4000, and mapping_min_folio_order() returns 2.
>
> do_read_cache_folio() allocates a folio before the BLKBSZSET command
> is executed. This results in the allocated folio having an order value
> of 0. Later, after BLKBSZSET is executed, the block size increases to
> 0x4000, and the mapping_min_folio_order() calculation result becomes 2.
> This leads to two undesirable consequences:
> 1. filemap_add_folio() triggers a VM_BUG_ON_FOLIO(folio_order(folio) <
> mapping_min_folio_order(mapping)) assertion.
> 2. The syzbot report [1] shows a null pointer dereference in
> create_empty_buffers() due to a buffer head allocation failure.
>
> Synchronization should be established based on the inode between the
> BLKBSZSET command and read cache page to prevent inconsistencies in
> block size or mapping flags before and after folio allocation.
While that fixes the problem, I'd rather lock the invalidation lock
instead of the inode:
filemap_invalidate_lock(mapping);
page = read_cache_page_gfp(mapping, bytenr >> PAGE_SHIFT, GFP_NOFS);
filemap_invalidate_unlock(mapping);
Because that's what the documentation for read_cache_page_gfp() asks for:
/**
(...)
*
* The function expects mapping->invalidate_lock to be already held.
(...)
*/
struct page *read_cache_page_gfp(struct address_space *mapping,
pgoff_t index,
gfp_t gfp)
{
return do_read_cache_page(mapping, index, NULL, NULL, gfp);
}
That will fix the race with set_blocksize() too.
Thanks.
>
> [1]
> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> RIP: 0010:create_empty_buffers+0x4d/0x480 fs/buffer.c:1694
> Call Trace:
> folio_create_buffers+0x109/0x150 fs/buffer.c:1802
> block_read_full_folio+0x14c/0x850 fs/buffer.c:2403
> filemap_read_folio+0xc8/0x2a0 mm/filemap.c:2496
> do_read_cache_folio+0x266/0x5c0 mm/filemap.c:4096
> do_read_cache_page mm/filemap.c:4162 [inline]
> read_cache_page_gfp+0x29/0x120 mm/filemap.c:4195
> btrfs_read_disk_super+0x192/0x500 fs/btrfs/volumes.c:1367
>
> Reported-by: syzbot+b4a2af3000eaa84d95d5@...kaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b4a2af3000eaa84d95d5
> Tested-by: syzbot+b4a2af3000eaa84d95d5@...kaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@...com>
> ---
> fs/btrfs/volumes.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 13c514684cfb..eee7471a3e03 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1339,6 +1339,7 @@ struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev,
> struct page *page;
> u64 bytenr, bytenr_orig;
> struct address_space *mapping = bdev->bd_mapping;
> + struct inode *inode = mapping->host;
> int ret;
>
> bytenr_orig = btrfs_sb_offset(copy_num);
> @@ -1364,7 +1365,9 @@ struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev,
> (bytenr + BTRFS_SUPER_INFO_SIZE) >> PAGE_SHIFT);
> }
>
> + inode_lock(inode);
> page = read_cache_page_gfp(mapping, bytenr >> PAGE_SHIFT, GFP_NOFS);
> + inode_unlock(inode);
> if (IS_ERR(page))
> return ERR_CAST(page);
>
> --
> 2.43.0
>
>
Powered by blists - more mailing lists