lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <x72nqrebxd5ng6gfxkmod2arzoplm27su2lscyhk6pp2e54zmm@nvnfxzr4lxss>
Date: Mon, 10 Nov 2025 11:47:08 +0100
From: Jan Kara <jack@...e.cz>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: brauner@...nel.org, viro@...iv.linux.org.uk, jack@...e.cz, 
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] fs: move inode fields used during fast path lookup
 closer together

On Sun 09-11-25 13:19:31, Mateusz Guzik wrote:
> This should avoid *some* cache misses.
> 
> Successful path lookup is guaranteed to load at least ->i_mode,
> ->i_opflags and ->i_acl. At the same time the common case will avoid
> looking at more fields.
> 
> struct inode is not guaranteed to have any particular alignment, notably
> ext4 has it only aligned to 8 bytes meaning nearby fields might happen
> to be on the same or only adjacent cache lines depending on luck (or no
> luck).
> 
> According to pahole:
>         umode_t                    i_mode;               /*     0     2 */
>         short unsigned int         i_opflags;            /*     2     2 */
>         kuid_t                     i_uid;                /*     4     4 */
>         kgid_t                     i_gid;                /*     8     4 */
>         unsigned int               i_flags;              /*    12     4 */
>         struct posix_acl *         i_acl;                /*    16     8 */
>         struct posix_acl *         i_default_acl;        /*    24     8 */
> 
> ->i_acl is unnecessarily separated by 8 bytes from the other fields.
> With struct inode being offset 48 bytes into the cacheline this means an
> avoidable miss. Note it will still be there for the 56 byte case.
> 
> New layout:
>         umode_t                    i_mode;               /*     0     2 */
>         short unsigned int         i_opflags;            /*     2     2 */
>         unsigned int               i_flags;              /*     4     4 */
>         struct posix_acl *         i_acl;                /*     8     8 */
>         struct posix_acl *         i_default_acl;        /*    16     8 */
>         kuid_t                     i_uid;                /*    24     4 */
>         kgid_t                     i_gid;                /*    28     4 */
> 
> I verified with pahole there are no size or hole changes.
> 
> This is stopgap until someone(tm) sanitizes the layout in the first
> place, allocation methods aside.
> 
> Signed-off-by: Mateusz Guzik <mjguzik@...il.com>

Looks sensible. Feel free to add:

Reviewed-by: Jan Kara <jack@...e.cz>

								Honza

> ---
> 
> > Successful path lookup is guaranteed to load at least ->i_mode,
> > ->i_opflags and ->i_acl. At the same time the common case will avoid
> > looking at more fields.
> 
> While this is readily apparent with my patch to add dedicated MAY_EXEC
> handling, this is already true for the stock kernel.
> 
>  include/linux/fs.h | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index bd0740e3bfcb..314a1349747b 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -790,14 +790,13 @@ struct inode_state_flags {
>  struct inode {
>  	umode_t			i_mode;
>  	unsigned short		i_opflags;
> -	kuid_t			i_uid;
> -	kgid_t			i_gid;
>  	unsigned int		i_flags;
> -
>  #ifdef CONFIG_FS_POSIX_ACL
>  	struct posix_acl	*i_acl;
>  	struct posix_acl	*i_default_acl;
>  #endif
> +	kuid_t			i_uid;
> +	kgid_t			i_gid;
>  
>  	const struct inode_operations	*i_op;
>  	struct super_block	*i_sb;
> -- 
> 2.48.1
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ