netdev - Re: [PATCH v2 bpf-next 1/3] bpf: add mmap() support for BPF_MAP_TYPE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAEf4Bzb5y3eEq4QU_3zdGJts4PeSVGr1YVmmWjTgs=LOf4PLKw@mail.gmail.com>
Date:   Tue, 12 Nov 2019 19:19:33 -0800
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     Andrii Nakryiko <andriin@...com>, bpf <bpf@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>,
        Alexei Starovoitov <ast@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Kernel Team <kernel-team@...com>,
        Rik van Riel <riel@...riel.com>,
        Johannes Weiner <hannes@...xchg.org>
Subject: Re: [PATCH v2 bpf-next 1/3] bpf: add mmap() support for BPF_MAP_TYPE_ARRAY

On Tue, Nov 12, 2019 at 2:38 PM Jakub Kicinski
<jakub.kicinski@...ronome.com> wrote:
>
> On Tue, 12 Nov 2019 14:03:50 -0800, Andrii Nakryiko wrote:
> > On Tue, Nov 12, 2019 at 11:17 AM Jakub Kicinski wrote:
> > > On Mon, 11 Nov 2019 18:06:42 -0800, Andrii Nakryiko wrote:
> > > > So let's say if sizeof(struct bpf_array) is 300, then I'd have to either:
> > > >
> > > > - somehow make sure that I allocate 4k (for data) + 300 (for struct
> > > > bpf_array) in such a way that those 4k of data are 4k-aligned. Is
> > > > there any way to do that?
> > > > - assuming there isn't, then another way would be to allocate entire
> > > > 4k page for struct bpf_array itself, but put it at the end of that
> > > > page, so that 4k of data is 4k-aligned. While wasteful, the bigger
> > > > problem is that pointer to bpf_array is not a pointer to allocated
> > > > memory anymore, so we'd need to remember that and adjust address
> > > > before calling vfree().
> > > >
> > > > Were you suggesting #2 as a solution? Or am I missing some other way to do this?
> > >
> > > I am suggesting #2, that's the way to do it in the kernel.
> >
> > So I'm concerned about this approach, because it feels like a bunch of
> > unnecessarily wasted memory. While there is no way around doing
> > round_up(PAGE_SIZE) for data itself, it certainly is not necessary to
> > waste almost entire page for struct bpf_array. And given this is going
> > to be used for BPF maps backing global variables, there most probably
> > will be at least 3 (.data, .bss, .rodata) per each program, and could
> > be more. Also, while on x86_64 page is 4k, on other architectures it
> > can be up to 64KB, so this seems wasteful.
>
> With the extra mutex and int you grew struct bpf_map from 192B to 256B,
> that's for every map on the system, unconditionally, and array map has
> an extra pointer even if it doesn't need it.
>
> Increasing "wasted" space when an opt-in feature is selected doesn't
> seem all that terrible to me, especially that the overhead of aligning
> up map size to page size is already necessary.

Well, I've been talking about one more extra page for struct bpf_array
itself, on top of what we already potentially waste for mmap()'ing
array data. But I went ahead and posted v3 with layout we discussed
here, aligning array->value on page boundary. Let's see if you like it
better.

>
> > What's your concern exactly with the way it's implemented in this patch?
>
> Judging by other threads we seem to care about performance of BPF
> (rightly so). Doing an extra pointer deref for every static data access
> seems like an obvious waste.
>
> But then again, it's just an obvious suggestion, take it or leave it..