Message-ID: <706568C6-43DB-4CEF-8BD8-A46BC3E5AC9C@fb.com>
Date: Thu, 21 Nov 2024 18:28:35 +0000
From: Song Liu <songliubraving@...a.com>
To: Casey Schaufler <casey@...aufler-ca.com>
CC: Song Liu <songliubraving@...a.com>, "Dr. Greg" <greg@...ellic.com>,
 James Bottomley <James.Bottomley@...senPartnership.com>,
 "jack@...e.cz" <jack@...e.cz>,
 "brauner@...nel.org" <brauner@...nel.org>,
 Song Liu <song@...nel.org>,
 "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
 "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "linux-security-module@...r.kernel.org" <linux-security-module@...r.kernel.org>,
 Kernel Team <kernel-team@...a.com>,
 "andrii@...nel.org" <andrii@...nel.org>,
 "eddyz87@...il.com" <eddyz87@...il.com>,
 "ast@...nel.org" <ast@...nel.org>,
 "daniel@...earbox.net" <daniel@...earbox.net>,
 "martin.lau@...ux.dev" <martin.lau@...ux.dev>,
 "viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
 "kpsingh@...nel.org" <kpsingh@...nel.org>,
 "mattbobrowski@...gle.com" <mattbobrowski@...gle.com>,
 "amir73il@...il.com" <amir73il@...il.com>,
 "repnop@...gle.com" <repnop@...gle.com>,
 "jlayton@...nel.org" <jlayton@...nel.org>,
 Josef Bacik <josef@...icpanda.com>,
 "mic@...ikod.net" <mic@...ikod.net>,
 "gnoack@...gle.com" <gnoack@...gle.com>
Subject: Re: [PATCH bpf-next 0/4] Make inode storage available to tracing prog
> On Nov 21, 2024, at 9:47 AM, Casey Schaufler <casey@...aufler-ca.com> wrote:
>
> On 11/21/2024 12:28 AM, Song Liu wrote:
>> Hi Dr. Greg,
>>
>> Thanks for your input!
>>
>>> On Nov 20, 2024, at 8:54 AM, Dr. Greg <greg@...ellic.com> wrote:
>>>
>>> On Tue, Nov 19, 2024 at 10:14:29AM -0800, Casey Schaufler wrote:
>> [...]
>>
>>>>> 2.) Implement key/value mapping for inode specific storage.
>>>>>
>>>>> The key would be a sub-system specific numeric value that returns a
>>>>> pointer the sub-system uses to manage its inode specific memory for a
>>>>> particular inode.
>>>>>
>>>>> A participating sub-system in turn uses its identifier to register an
>>>>> inode specific pointer for its sub-system.
>>>>>
>>>>> This strategy loses O(1) lookup complexity but reduces total memory
>>>>> consumption and only imposes memory costs for inodes when a sub-system
>>>>> desires to use inode specific storage.
>>>> SELinux and Smack use an inode blob for every inode. The performance
>>>> regression boggles the mind. Not to mention the additional
>>>> complexity of managing the memory.
>>> I guess we would have to measure the performance impacts to understand
>>> their level of mind boggliness.
>>>
>>> My first thought is that we hear a huge amount of fanfare about BPF
>>> being a game changer for tracing and network monitoring. Given
>>> current networking speeds, if its ability to manage the storage needed
>>> for its purposes were truly abysmal, the industry wouldn't be finding
>>> the technology useful.
>>>
>>> Beyond that.
>>>
>>> As I noted above, the LSM could be an independent subscriber. The
>>> pointer to register would come from the kmem_cache allocator as it
>>> does now, so that cost is idempotent with the current implementation.
>>> The pointer registration would also be a single instance cost.
>>>
>>> So the primary cost differential over the common arena model will be
>>> the complexity costs associated with lookups in a red/black tree, if
>>> we used the old IMA integrity cache as an example implementation.
>>>
>>> As I noted above, these per inode local storage structures are complex
>>> in and of themselves, including lists and locks. If touching an inode
>>> involves locking and walking lists and the like it would seem that
>>> those performance impacts would quickly swamp an r/b lookup cost.
>> bpf local storage is designed to be an arena-like solution that works
>> for multiple bpf maps (and we don't know how many maps we need
>> ahead of time). Therefore, we may end up doing what you suggested
>> earlier: every LSM should use bpf inode storage. ;) I am only 90%
>> kidding.
>
> Sorry, but that's not funny.

I didn't think this was funny. Many use cases can seriously benefit
from a _reliable_ allocator for inode-attached data.

> It's the kind of suggestion that some
> yoho takes seriously, whacks together a patch for, and gets accepted
> via the xfd887 device tree. Then everyone screams at the SELinux folks
> because of the performance impact. As I have already pointed out,
> there are serious consequences for an LSM that has a blob on every
> inode.

i_security serves this type of user pretty well. I see no reason
to change this. At the same time, I see no reason to block
optimizations for other use cases just because these users may get
blamed in 2087 for a mistake by xfd887 device maintainers.

Song
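
To make the trade-off debated in this thread concrete, here is a minimal
userspace sketch in C. It is not kernel code. It contrasts a per-inode blob,
where each sub-system finds its data at a fixed offset (the O(1) model that
i_security provides), with a keyed lookup, where an inode pays only for the
sub-systems that register data on it. All names (subsys_id, blob_inode,
kv_inode and the helper functions) are hypothetical, and the keyed model
uses a plain linked list as a stand-in for the red/black tree mentioned in
the quoted proposal.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sub-system identifiers; these are not kernel names. */
enum subsys_id { SUBSYS_A, SUBSYS_B, SUBSYS_MAX };

/* Model 1: one blob per inode, each sub-system at a fixed offset. */
struct blob_inode {
    void *blob; /* allocated for every inode, used or not */
};

/* Assumed layout: SUBSYS_A at byte 0, SUBSYS_B at byte 16, 32 bytes total. */
static const size_t blob_offset[SUBSYS_MAX] = { 0, 16 };
static const size_t blob_size = 32;

static int blob_inode_init(struct blob_inode *inode)
{
    inode->blob = calloc(1, blob_size);
    return inode->blob ? 0 : -1;
}

/* O(1): pointer arithmetic only, no search and no per-lookup allocation. */
static void *blob_lookup(struct blob_inode *inode, enum subsys_id id)
{
    return (char *)inode->blob + blob_offset[id];
}

/* Model 2: keyed storage, allocated only when a sub-system registers. */
struct kv_entry {
    enum subsys_id id;
    void *data;
    struct kv_entry *next;
};

struct kv_inode {
    struct kv_entry *head; /* NULL until some sub-system registers data */
};

/* Attach a sub-system's pointer to this inode; costs one small allocation. */
static int kv_register(struct kv_inode *inode, enum subsys_id id, void *data)
{
    struct kv_entry *e = malloc(sizeof(*e));

    if (!e)
        return -1;
    e->id = id;
    e->data = data;
    e->next = inode->head;
    inode->head = e;
    return 0;
}

/* O(n) in the number of registered sub-systems on this inode. */
static void *kv_lookup(const struct kv_inode *inode, enum subsys_id id)
{
    for (const struct kv_entry *e = inode->head; e; e = e->next)
        if (e->id == id)
            return e->data;
    return NULL;
}
```

In this sketch an LSM that labels every inode is best served by the first
model, while a sub-system that touches only a few inodes avoids the
per-inode memory cost with the second, at the price of a search on each
lookup.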