[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <x72nqrebxd5ng6gfxkmod2arzoplm27su2lscyhk6pp2e54zmm@nvnfxzr4lxss>
Date: Mon, 10 Nov 2025 11:47:08 +0100
From: Jan Kara <jack@...e.cz>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: brauner@...nel.org, viro@...iv.linux.org.uk, jack@...e.cz,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] fs: move inode fields used during fast path lookup
closer together
On Sun 09-11-25 13:19:31, Mateusz Guzik wrote:
> This should avoid *some* cache misses.
>
> Successful path lookup is guaranteed to load at least ->i_mode,
> ->i_opflags and ->i_acl. At the same time the common case will avoid
> looking at more fields.
>
> struct inode is not guaranteed to have any particular alignment, notably
> ext4 has it only aligned to 8 bytes meaning nearby fields might happen
> to be on the same or only adjacent cache lines depending on luck (or no
> luck).
>
> According to pahole:
> umode_t i_mode; /* 0 2 */
> short unsigned int i_opflags; /* 2 2 */
> kuid_t i_uid; /* 4 4 */
> kgid_t i_gid; /* 8 4 */
> unsigned int i_flags; /* 12 4 */
> struct posix_acl * i_acl; /* 16 8 */
> struct posix_acl * i_default_acl; /* 24 8 */
>
> ->i_acl is unnecessarily separated by 8 bytes from the other fields.
> With struct inode being offset 48 bytes into the cacheline this means an
> avoidable miss. Note it will still be there for the 56 byte case.
>
> New layout:
> umode_t i_mode; /* 0 2 */
> short unsigned int i_opflags; /* 2 2 */
> unsigned int i_flags; /* 4 4 */
> struct posix_acl * i_acl; /* 8 8 */
> struct posix_acl * i_default_acl; /* 16 8 */
> kuid_t i_uid; /* 24 4 */
> kgid_t i_gid; /* 28 4 */
>
> I verified with pahole there are no size or hole changes.
>
> This is stopgap until someone(tm) sanitizes the layout in the first
> place, allocation methods aside.
>
> Signed-off-by: Mateusz Guzik <mjguzik@...il.com>
Looks sensible. Feel free to add:
Reviewed-by: Jan Kara <jack@...e.cz>
Honza
> ---
>
> > Successful path lookup is guaranteed to load at least ->i_mode,
> > ->i_opflags and ->i_acl. At the same time the common case will avoid
> > looking at more fields.
>
> While this is readily apparent with my patch to add dedicated MAY_EXEC
> handling, this is already true for the stock kernel.
>
> include/linux/fs.h | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index bd0740e3bfcb..314a1349747b 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -790,14 +790,13 @@ struct inode_state_flags {
> struct inode {
> umode_t i_mode;
> unsigned short i_opflags;
> - kuid_t i_uid;
> - kgid_t i_gid;
> unsigned int i_flags;
> -
> #ifdef CONFIG_FS_POSIX_ACL
> struct posix_acl *i_acl;
> struct posix_acl *i_default_acl;
> #endif
> + kuid_t i_uid;
> + kgid_t i_gid;
>
> const struct inode_operations *i_op;
> struct super_block *i_sb;
> --
> 2.48.1
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists