[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wh+pk72FM+a7PoW2s46aU9OQZrY-oApMZSUH0Urg9bsMA@mail.gmail.com>
Date: Sat, 12 Apr 2025 13:22:38 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: Christian Brauner <brauner@...nel.org>, Al Viro <viro@...iv.linux.org.uk>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>, Jan Kara <jack@...e.cz>,
Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: generic_permission() optimization
On Sat, 12 Apr 2025 at 09:26, Mateusz Guzik <mjguzik@...il.com> wrote:
>
> I plopped your snippet towards the end of __ext4_iget:
That's literally where I did the same thing, except I put it right after the
brelse(iloc.bh);
line, rather than before as you did.
And it made no difference for me, but I didn't try to figure out why.
Maybe some environment differences? Or maybe I just screwed up my
testing...
As mentioned earlier in the thread, I had this bi-modal distribution
of results, because if I had a load where the *non*-owner of the inode
looked up the pathnames, then the ACL information would get filled in
when the VFS layer would do the lookup, and then once the ACLs were
cached, everything worked beautifully.
But if the only lookups of a path were done by the owner of the inodes
(which is typical for at least my normal kernel build tree - nothing
but my build will look at the files, and they are obviously always
owned by me) then the ACL caches will never be filled because there
will never be any real ACL lookups.
And then rather than doing the nice efficient "no ACLs anywhere, no
need to even look", it ends up having to actually do the vfsuid
comparison for the UID equality check.
Which then does the extra accesses to look up the idmap etc, and is
visible in the profiles due to that whole dance:
/* Are we the owner? If so, ACL's don't matter */
vfsuid = i_uid_into_vfsuid(idmap, inode);
if (likely(vfsuid_eq_kuid(vfsuid, current_fsuid()))) {
even when idmap is 'nop_mnt_idmap' and it is reasonably cheap. Just
because it ends up calling out to different functions and does extra
D$ accesses to the inode and the suberblock (ie i_user_ns() is this
return inode->i_sb->s_user_ns;
so just to *see* that it's nop_mnt_idmap takes effort.
One improvement might be to cache that 'nop_mnt_idmap' thing in the
inode as a flag.
But it would be even better if the filesystem just initializes the
inode at inode read time to say "I have no ACL's for this inode" and
none of this code will even trigger.
Linus
Powered by blists - more mailing lists