lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <EEEFDE10-07B5-47B5-BE6E-DAC2FE2B7576@fb.com>
Date: Thu, 19 Dec 2024 03:03:46 +0000
From: Song Liu <songliubraving@...a.com>
To: Dave Chinner <david@...morbit.com>
CC: Song Liu <song@...nel.org>,
        "linux-fsdevel@...r.kernel.org"
	<linux-fsdevel@...r.kernel.org>,
        "linux-doc@...r.kernel.org"
	<linux-doc@...r.kernel.org>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "linux-btrfs@...r.kernel.org"
	<linux-btrfs@...r.kernel.org>,
        "linux-xfs@...r.kernel.org"
	<linux-xfs@...r.kernel.org>,
        "linux-security-module@...r.kernel.org"
	<linux-security-module@...r.kernel.org>,
        "willy@...radead.org"
	<willy@...radead.org>,
        "corbet@....net" <corbet@....net>, Chris Mason
	<clm@...a.com>,
        Josef Bacik <josef@...icpanda.com>,
        "dsterba@...e.com"
	<dsterba@...e.com>,
        "brauner@...nel.org" <brauner@...nel.org>,
        "jack@...e.cz"
	<jack@...e.cz>, "cem@...nel.org" <cem@...nel.org>,
        "djwong@...nel.org"
	<djwong@...nel.org>,
        "paul@...l-moore.com" <paul@...l-moore.com>,
        "jmorris@...ei.org" <jmorris@...ei.org>,
        "serge@...lyn.com"
	<serge@...lyn.com>,
        "fdmanana@...e.com" <fdmanana@...e.com>,
        "johannes.thumshirn@....com" <johannes.thumshirn@....com>,
        Kernel Team
	<kernel-team@...a.com>
Subject: Re: [RFC v2] lsm: fs: Use inode_free_callback to free i_security in
 RCU callback

Hi Dave, 

Thanks for your comments!

> On Dec 18, 2024, at 5:47 PM, Dave Chinner <david@...morbit.com> wrote:
> 
> On Wed, Dec 18, 2024 at 01:16:15PM -0800, Song Liu wrote:
>> inode->i_security needes to be freed from RCU callback. A rcu_head was
>> added to i_security to call the RCU callback. However, since struct inode
>> already has i_rcu, the extra rcu_head is wasteful. Specifically, when any
>> LSM uses i_security, a rcu_head (two pointers) is allocated for each
>> inode.
>> 
>> Rename i_callback to inode_free_callback and call security_inode_free_rcu
>> from it to free i_security so that a rcu_head is saved for each inode.
>> Special care are needed for file systems that provide a destroy_inode()
>> callback, but not a free_inode() callback. Specifically, the following
>> logic are added to handle such cases:
>> 
>> - XFS recycles inode after destroy_inode. The inodes are freed from
>>   recycle logic. Let xfs_inode_free_callback() call inode_free_callback.
> 
> NACK. That's a massive layering violation.
> 
> LSMs are VFS layer functionality. They *must* be removed from the
> VFS inode before ->destroy_inode() is called. If a filesystem
> supplies ->destroy_inode(), then it -owns- the inode exclusively
> from that point onwards. All VFS and 3rd party stuff hanging off the
> inode must be removed from the inode before ->destroy_inode() is
> called.

To be honest, I am not sure this rule is true. Most filesystems 
provide a free_inode() callback:

$ grep "\.free_inode.*=" fs/*/*.c | wc
     54     221    2808

For all these cases, VFS layer is still in charge of freeing the
inode after a RCU grace period. From vfs.rst:

``free_inode``
        this method is called from RCU callback. If you use call_rcu()
        in ->destroy_inode to free 'struct inode' memory, then it's
        better to release memory in this method.

> Them's the rules, and hacking around LSM object allocation/freeing
> to try to handle how filesystems manage inodes -underneath- the VFS
> is just asking for problems. LSM object management needs to be
> handled entirely by the generic LSM VFS hooks, not tightly coupled
> to VFS and/or low level filesystem inode lifecycle management.

Unfortunately, many filesystems already call various security_* 
hooks directly:

$ grep "include <linux/security.h>" fs/*/*.c -rI | wc
     46      92    2144
(46 files in various filesystems call into the security layer.)

Therefore, I don't feel this change alone is massive layering 
violation. Instead, it is probably making existing layering 
violation a bit worse. I personally think it is not making
things too worse. On the other hand, saving a rcu_head per 
inode is non-trivial savings. 

It worth note that many systems allocate memory on i_security, 
even the user is not using any LSM. This is because some LSMs 
can only be disabled at compile time. Distro kernels are more 
likely to enable them to fit various users' needs. Therefore, 
the i_security->rcu_head "tax" is not somebody else's problem. 
We probably just don't know we are paying it. 

That being said, I am not too attached to this change. If folks
think allocating an extra rcu_head per inode is the better 
trade-off. I am OK with that answer. I put "RFC" there for 
comments. 

I hope this makes sense. 

Thanks again for your comments. 

Song

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ