Message-ID: <ZBsihOYrMCILT2cI@kernel.org>
Date:   Wed, 22 Mar 2023 12:45:08 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Jiri Olsa <olsajiri@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Hao Luo <haoluo@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, bpf@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-perf-users@...r.kernel.org,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...omium.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Namhyung Kim <namhyung@...il.com>,
        Dave Chinner <david@...morbit.com>
Subject: Re: [PATCHv3 bpf-next 0/9] mm/bpf/perf: Store build id in file object

Em Sat, Mar 18, 2023 at 03:16:45PM +0000, Matthew Wilcox escreveu:
> On Sat, Mar 18, 2023 at 09:33:49AM +0100, Jiri Olsa wrote:
> > On Thu, Mar 16, 2023 at 05:34:41PM +0000, Matthew Wilcox wrote:
> > > On Thu, Mar 16, 2023 at 06:01:40PM +0100, Jiri Olsa wrote:
> > > > hi,
> > > > this patchset adds build id object pointer to struct file object.
> > > > 
> > > > We have several use cases for build id to be used in BPF programs
> > > > [2][3].
> > > 
> > > Yes, you have use cases, but you never answered the question I asked:
> > > 
> > > Is this going to be enabled by every distro kernel, or is it for special
> > > use-cases where only people doing a very specialised thing who are
> > > willing to build their own kernels will use it?
> > 
> > I hope so, but I guess only time tell.. given the response by Ian and Andrii
> > there are 3 big users already
> 
> So the whole "There's a config option to turn it off" shtick is just a
> fig-leaf.  I won't ever see it turned off.  You're imposing the cost of
> this on EVERYONE who runs a distro kernel.  And almost nobody will see
> any benefits from it.  Thanks for admitting that.

I agree that build-ids are not useful for all 'struct file' uses, just
for executable files and for people wanting to have better observability
capabilities.

Having said that, it seems there will be no extra memory overhead at
least for a fedora:36 x86_64 kernel:

void __init files_init(void)
{
        filp_cachep = kmem_cache_create("filp", sizeof(struct file), 0,
                        SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT, NULL);
        percpu_counter_init(&nr_files, 0, GFP_KERNEL);
}

[root@...co ~]# pahole file | grep size: -A2
	/* size: 232, cachelines: 4, members: 20 */
	/* sum members: 228, holes: 1, sum holes: 4 */
	/* last cacheline: 40 bytes */
[acme@...co perf-tools]$ uname -a
Linux quaco 6.1.11-100.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb  9 20:36:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@...co ~]# head -2 /proc/slabinfo 
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
[root@...co ~]# grep -w filp /proc/slabinfo 
filp               12452  13056    256   32    2 : tunables    0    0    0 : slabdata    408    408      0
[root@...co ~]#

so there are 24 bytes (256 - 232) per object, on the 4th cacheline, that
are not being used, right?
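
The arithmetic behind that number can be sketched in plain userspace C. This is only a rough model of the rounding, assuming 64-byte cachelines as on this x86_64 box; round_up_pow2() and slab_slack() are made-up helper names, not kernel functions, and the real allocator takes more into account than this:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cachelines, as on the x86_64 machine above. */
#define CACHELINE_BYTES 64u

/* Round size up to the next multiple of align (a power of two). */
static size_t round_up_pow2(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* Padding left at the end of an object rounded up to a cacheline. */
static size_t slab_slack(size_t obj_size)
{
	return round_up_pow2(obj_size, CACHELINE_BYTES) - obj_size;
}
```

With the pahole numbers above, slab_slack(232) is 24, and round_up_pow2(232 + 20, CACHELINE_BYTES) is still 256, so under these assumptions a 20-byte build id would fit in the padding.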

One other observation is that maybe we could do it as the 'struct sock'
hierarchy in networking, where we would have a 'struct exec_file' that
would be:

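	struct exec_file {
		struct file file;	/* embedded, as 'struct sock' is in its derived types */
		char build_id[20];	/* 20 bytes, enough for a SHA-1 build id */
	};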

say, and then when we create the 'struct file' in __alloc_file() we
could check some bit in 'flags' like Al Viro suggested and pick a
different slab than 'filp_cachep', that has that extra space for the
build_id (and whatever else exec related state we may end up wanting, if
ever).

No core fs code will need to know about that, except at free time, when
the object has to go back to the slab cache it came from.
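
A minimal userspace sketch of that idea, to make the alloc/free pairing concrete. Everything here is an assumption for illustration: 'struct file' is a stripped-down stand-in, F_HAS_EXEC_STATE is a hypothetical flag bit, calloc()/free() stand in for the two slab caches, and alloc_file_sketch(), free_file_sketch() and to_exec_file() are made-up names, not kernel functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BUILD_ID_LEN     20
#define F_HAS_EXEC_STATE 0x1u	/* hypothetical flag bit */

struct file {			/* stand-in for the real struct file */
	unsigned int flags;
};

struct exec_file {
	struct file file;
	unsigned char build_id[BUILD_ID_LEN];
};

/* Return the enclosing exec_file, or NULL for a plain file. */
static struct exec_file *to_exec_file(struct file *f)
{
	assert(f != NULL);
	if (!(f->flags & F_HAS_EXEC_STATE))
		return NULL;
	return (struct exec_file *)((char *)f - offsetof(struct exec_file, file));
}

/* Pick the bigger "cache" only when the flag asks for exec state. */
static struct file *alloc_file_sketch(unsigned int flags)
{
	struct file *f;

	if (flags & F_HAS_EXEC_STATE) {
		struct exec_file *ef = calloc(1, sizeof(*ef));

		if (!ef)
			return NULL;
		f = &ef->file;
	} else {
		f = calloc(1, sizeof(*f));
		if (!f)
			return NULL;
	}
	f->flags = flags;
	return f;
}

/* The only place that has to know which "cache" the object came from. */
static void free_file_sketch(struct file *f)
{
	struct exec_file *ef;

	if (!f)
		return;
	ef = to_exec_file(f);
	if (ef)
		free(ef);
	else
		free(f);
}
```

Callers that only deal with 'struct file *' never see the difference; only code that wants the build id calls to_exec_file() and has to handle a NULL return.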

In current distro configs, no overhead would take place if I read that
SLAB_HWCACHE_ALIGN thing right, no?

- Arnaldo
