Message-ID: <yp4gorgjhh6c3qeopjabmknimeifhnpbz63irrrtjpplatnk4k@ycofoucc4ry3>
Date: Wed, 5 Nov 2025 11:14:01 +0100
From: Jan Kara <jack@...e.cz>
To: libaokun@...weicloud.com
Cc: linux-ext4@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca, 
	jack@...e.cz, linux-kernel@...r.kernel.org, kernel@...kajraghav.com, 
	mcgrof@...nel.org, linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, 
	yi.zhang@...wei.com, yangerkun@...wei.com, chengzhihao1@...wei.com, 
	libaokun1@...wei.com
Subject: Re: [PATCH 25/25] ext4: enable block size larger than page size

On Sat 25-10-25 11:22:21, libaokun@...weicloud.com wrote:
> From: Baokun Li <libaokun1@...wei.com>
> 
> Since the block device (see commit 3c20917120ce ("block/bdev: enable large
> folio support for large logical block sizes")) and the page cache (see commit
> ab95d23bab220ef8 ("filemap: allocate mapping_min_order folios in the page
> cache")) can now enforce a minimum order when allocating folios, and ext4
> has supported large folios since commit 7ac67301e82f ("ext4: enable large
> folio for regular file"), add support for block_size > PAGE_SIZE in ext4.
> 
> set_blocksize() -> bdev_validate_blocksize() already validates the block
> size, so ext4_load_super() does not need to perform additional checks.
> 
> Here we only need to enable large folio by default when s_min_folio_order
> is greater than 0 and add the FS_LBS bit to fs_flags.
> 
> In addition, mark this feature as experimental.
> 
> Signed-off-by: Baokun Li <libaokun1@...wei.com>
> Reviewed-by: Zhang Yi <yi.zhang@...wei.com>

...

> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 04f9380d4211..ba6cf05860ae 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5146,6 +5146,9 @@ static bool ext4_should_enable_large_folio(struct inode *inode)
>  	if (!ext4_test_mount_flag(sb, EXT4_MF_LARGE_FOLIO))
>  		return false;
>  
> +	if (EXT4_SB(sb)->s_min_folio_order)
> +		return true;
> +

But now files with the data journalling flag enabled will get large folios,
possibly significantly greater than blocksize. I don't think there's a
fundamental reason why data journalling doesn't work with large folios, the
only thing that's likely going to break is that credit estimates will go
through the roof if there are too many blocks per folio. But that can be
handled by setting max folio order to be equal to min folio order when
journalling data for the inode.
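
To make the arithmetic concrete, here is a small userspace sketch, not kernel
code. It assumes 4 KiB pages, and every model_* name is hypothetical (none is
an ext4 or page cache API). It only shows how many blocks back one folio, which
is what the credit estimate has to cover, and how clamping max order to min
order bounds that number for data-journalled inodes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; these names are not kernel APIs. */
#define MODEL_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/* Smallest folio order that covers one block (0 if block <= page). */
static unsigned int model_min_folio_order(unsigned int blkbits)
{
	return blkbits > MODEL_PAGE_SHIFT ? blkbits - MODEL_PAGE_SHIFT : 0;
}

/* Number of filesystem blocks backing one folio of the given order. */
static uint64_t model_blocks_per_folio(unsigned int order, unsigned int blkbits)
{
	unsigned int folio_bits = MODEL_PAGE_SHIFT + order;

	/* A folio must be at least one block in size and fit in 64 bits. */
	assert(order >= model_min_folio_order(blkbits));
	assert(folio_bits < 64);
	return (uint64_t)1 << (folio_bits - blkbits);
}

/*
 * Pick the max folio order for an inode: when the inode journals data,
 * clamp max to min so that blocks per folio stays small.
 */
static unsigned int model_max_folio_order(unsigned int blkbits,
					  int journal_data,
					  unsigned int default_max_order)
{
	unsigned int min_order = model_min_folio_order(blkbits);

	if (journal_data || default_max_order < min_order)
		return min_order;
	return default_max_order;
}
```

With 64k blocks (blkbits = 16) and an assumed default max order of 9, a folio
covers 32 blocks. With the clamp applied for a data-journalled inode the order
drops to 4 and a folio covers exactly one block. For 1k blocks the clamp gives
order 0, i.e. 4 blocks per folio, the same as without large folios.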

It is a bit scary to be modifying max folio order in
ext4_change_inode_journal_flag() but I guess less scary than setting new
aops and if we prune the whole page cache before touching the order and
inode flag, we should be safe (famous last words ;).

								Honza

>  	if (!S_ISREG(inode->i_mode))
>  		return false;
>  	if (ext4_test_inode_flag(inode, EXT4_INODE_JOURNAL_DATA))
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index fdc006a973aa..4c0bd79bdf68 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -5053,6 +5053,9 @@ static int ext4_check_large_folio(struct super_block *sb)
>  		return -EINVAL;
>  	}
>  
> +	if (sb->s_blocksize > PAGE_SIZE)
> +		ext4_msg(sb, KERN_NOTICE, "EXPERIMENTAL bs(%lu) > ps(%lu) enabled.",
> +			 sb->s_blocksize, PAGE_SIZE);
>  	return 0;
>  }
>  
> @@ -7432,7 +7435,8 @@ static struct file_system_type ext4_fs_type = {
>  	.init_fs_context	= ext4_init_fs_context,
>  	.parameters		= ext4_param_specs,
>  	.kill_sb		= ext4_kill_sb,
> -	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME,
> +	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME |
> +				  FS_LBS,
>  };
>  MODULE_ALIAS_FS("ext4");
>  
> -- 
> 2.46.1
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
