lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241115111914.qhrwe4mek6quthko@quack3>
Date: Fri, 15 Nov 2024 12:19:14 +0100
From: Jan Kara <jack@...e.cz>
To: Song Liu <songliubraving@...a.com>
Cc: Christian Brauner <brauner@...nel.org>, Song Liu <song@...nel.org>,
	"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-security-module@...r.kernel.org" <linux-security-module@...r.kernel.org>,
	Kernel Team <kernel-team@...a.com>,
	"andrii@...nel.org" <andrii@...nel.org>,
	"eddyz87@...il.com" <eddyz87@...il.com>,
	"ast@...nel.org" <ast@...nel.org>,
	"daniel@...earbox.net" <daniel@...earbox.net>,
	"martin.lau@...ux.dev" <martin.lau@...ux.dev>,
	"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
	"jack@...e.cz" <jack@...e.cz>,
	"kpsingh@...nel.org" <kpsingh@...nel.org>,
	"mattbobrowski@...gle.com" <mattbobrowski@...gle.com>,
	"amir73il@...il.com" <amir73il@...il.com>,
	"repnop@...gle.com" <repnop@...gle.com>,
	"jlayton@...nel.org" <jlayton@...nel.org>,
	Josef Bacik <josef@...icpanda.com>,
	"mic@...ikod.net" <mic@...ikod.net>,
	"gnoack@...gle.com" <gnoack@...gle.com>
Subject: Re: [PATCH bpf-next 2/4] bpf: Make bpf inode storage available to
 tracing program

Hi Song!

On Thu 14-11-24 21:11:57, Song Liu wrote:
> > On Nov 13, 2024, at 2:19 AM, Christian Brauner <brauner@...nel.org> wrote:
> >> static inline void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> >>   bpf_func_t *bpf_func)
> >> {
> >> diff --git a/include/linux/fs.h b/include/linux/fs.h
> >> index 3559446279c1..479097e4dd5b 100644
> >> --- a/include/linux/fs.h
> >> +++ b/include/linux/fs.h
> >> @@ -79,6 +79,7 @@ struct fs_context;
> >> struct fs_parameter_spec;
> >> struct fileattr;
> >> struct iomap_ops;
> >> +struct bpf_local_storage;
> >> 
> >> extern void __init inode_init(void);
> >> extern void __init inode_init_early(void);
> >> @@ -648,6 +649,9 @@ struct inode {
> >> #ifdef CONFIG_SECURITY
> >> void *i_security;
> >> #endif
> >> +#ifdef CONFIG_BPF_SYSCALL
> >> + struct bpf_local_storage __rcu *i_bpf_storage;
> >> +#endif
> > 
> > Sorry, we're not growing struct inode for this. It just keeps getting
> > bigger. Last cycle we freed up 8 bytes to shrink it and we're not going
> > to waste them on special-purpose stuff. We already NAKed someone else's
> > pet field here.
> 
> Per other discussions in this thread, I am implementing the following:
> 
> #ifdef CONFIG_SECURITY
>         void                    *i_security;
> #elif CONFIG_BPF_SYSCALL
>         struct bpf_local_storage __rcu *i_bpf_storage;
> #endif
> 
> However, it is a bit trickier than I thought. Specifically, we need 
> to deal with the following scenarios:
>  
> 1. CONFIG_SECURITY=y && CONFIG_BPF_LSM=n && CONFIG_BPF_SYSCALL=y
> 2. CONFIG_SECURITY=y && CONFIG_BPF_LSM=y && CONFIG_BPF_SYSCALL=y but 
>    bpf lsm is not enabled at boot time. 
> 
> AFAICT, we need to modify how lsm blob are managed with 
> CONFIG_BPF_SYSCALL=y && CONFIG_BPF_LSM=n case. The solution, even
> if it gets accepted, doesn't really save any memory. Instead of 
> growing struct inode by 8 bytes, the solution will allocate 8
> more bytes to inode->i_security. So the total memory consumption
> is the same, but the memory is more fragmented. 

I guess you've found a better solution for this based on James' suggestion.

> Therefore, I think we should really step back and consider adding
> the i_bpf_storage to struct inode. While this does increase the
> size of struct inode by 8 bytes, it may end up with less overall
> memory consumption for the system. This is why. 
>
> When the user cannot use inode local storage, the alternative is 
> to use hash maps (use inode pointer as key). AFAICT, all hash maps 
> comes with non-trivial overhead, in memory consumption, in access 
> latency, and in extra code to manage the memory. OTOH, inode local 
> storage doesn't have these issue, and is usually much more efficient: 
>  - memory is only allocated for inodes with actual data, 
>  - O(1) latency, 
>  - per inode data is freed automatically when the inode is evicted. 
> Please refer to [1] where Amir mentioned all the work needed to 
> properly manage a hash map, and I explained why we don't need to 
> worry about these with inode local storage. 

Well, but here you are speaking of a situation where bpf inode storage
space gets actually used for most inodes. Then I agree i_bpf_storage is the
most economic solution. But I'd also expect that for vast majority of
systems the bpf inode storage isn't used at all and if it does get used, it
is used only for a small fraction of inodes. So we are weighting 8 bytes
per inode for all those users that don't need it against more significant
memory savings for users that actually do need per inode bpf storage. A
factor in this is that a lot of people are running some distribution kernel
which generally enables most config options that are at least somewhat
useful. So hiding the cost behind CONFIG_FOO doesn't really help such
people.
 
I'm personally not *so* hung up about a pointer in struct inode but I can
see why Christian is and I agree adding a pointer there isn't a win for
everybody.

Longer term, I think it may be beneficial to come up with a way to attach
private info to the inode in a way that doesn't cost us one pointer per
funcionality that may possibly attach info to the inode. We already have
i_crypt_info, i_verity_info, i_flctx, i_security, etc. It's always a tough
call where the space overhead for everybody is worth the runtime &
complexity overhead for users using the functionality...

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ