lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wjbSRL7LM7CvckB+goQdUokMa_6G-iirdbtxrFSFe3mfA@mail.gmail.com>
Date: Thu, 27 Mar 2025 12:40:09 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Stephen Smalley <stephen.smalley.work@...il.com>
Cc: Jeffrey Vander Stoep <jeffv@...gle.com>, ThiƩbaud Weksteen <tweek@...gle.com>, 
	Paul Moore <paul@...l-moore.com>, "Cameron K. Williams" <ckwilliams.work@...il.com>, 
	"Kipp N. Davis" <kippndavis.work@....com>, selinux@...r.kernel.org, 
	linux-security-module@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Nick Kralevich <nnk@...gle.com>, Kees Cook <kees@...nel.org>
Subject: Re: [GIT PULL] selinux/selinux-pr-20250323

On Thu, 27 Mar 2025 at 12:16, Stephen Smalley
<stephen.smalley.work@...il.com> wrote:
>
> Where could/would we cache that information so that it was accessible
> directly by the VFS layer?

So the VFS layer already does this for various other things. For this
case, the natural thing to do would be to add another IOP_xyzzy flag
in inode->i_opflags.

That's how we already say things like "this inode has no
filesystem-specific i_op->permission function" (IOP_FASTPERM), so that
we don't even have to follow the "inode->i_op->permission" pointer
chain to see a NULL pointer.

Yes, the VFS layer is *heavily* optimized like that. It literally does
that IOP_FASTPERM to avoid chasing two pointers - not even the call,
just the "don't even bother to follow pointers to see if it's NULL".
See do_inode_permission().

And we have 16 bits in that inode->i_opflags, and currently only use 7
of those bits. Adding one bit for a IOP_NO_SECURITY_LOOKUP kind of
logic (feel free to rename that - just throwing a random name out as a
suggestion) would be a complete no-brainer.

NOTE! The rule for the i_opflags accesses is that *reading* them is
done with no locking at all, but changing them takes the inode
spinlock (and we should technically probably use WRITE_ONCE() and
READ_ONCE(), but we don't).

And notice that the "no locking at all for reading" means that if you
*change* the bit - even though that involves locking - there may be
concurrent lookups in process that won't see the change, and would go
on as if the lookup still does not need any security layer call. No
serialization to readers at all (although you could wait for an RCU
period after changing if you really need to, and only use the bit in
the RCU lookup).

That should be perfectly fine - I really don't think serialization is
even needed. If somebody is changing the policy rules, any file
lookups *concurrent* to that change might not see the new rules, but
that's the same as if it happened before the change.

I just wanted to point out that the serialization is unbalanced: the
spinlock for changing the flag is literally just to make sure that two
bits being changed at the same time don't stomp on each other (because
it's a 16-bit non-atomic field, and we didn't want to use a "unsigned
long" and atomic bitops because the cache layout of the inode is also
a big issue).

And you can take the IOP_FASTPERM thing as an example of how to do
this: it is left clear initially, and what happens is that during the
permission lookup, if it *isn't* set, we'll follow those
inode->i_io->permission pointers, and notice that we should set it:

        if (unlikely(!(inode->i_opflags & IOP_FASTPERM))) {
                if (likely(inode->i_op->permission))
                        return inode->i_op->permission(idmap, inode, mask);

                /* This gets set once for the inode lifetime */
                spin_lock(&inode->i_lock);
                inode->i_opflags |= IOP_FASTPERM;
                spin_unlock(&inode->i_lock);
        }

and I think the security layer could take a very similar approach: not
setting that IOP_NO_SECURITY_LOOKUP initially, but *when* a
security_inode_permission() call is made with just MAY_NOT_BLOCK |
MAY_LOOKUP, and the security layer notices that "this inode has no
reason to care", it could set the bit so that *next* time around the
VFS layer won't bother to call into security_inode_permission()
unnecessarily.

Does that clarify?

             Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ