netdev - Re: [PATCH bpf-next v5 6/9] bpftool: Implement relocations recording for BTFGen

Message-ID: <CAHap4zuD8j7CXwOK2a12=j0j0b7twHs6gwKEBNagdryHWNQyWQ@mail.gmail.com>
Date:   Fri, 4 Feb 2022 14:44:31 -0500
From:   Mauricio Vásquez Bernal <mauricio@...volk.io>
To:     Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc:     Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Quentin Monnet <quentin@...valent.com>,
        Rafael David Tinoco <rafaeldtinoco@...il.com>,
        Lorenzo Fontana <lorenzo.fontana@...stic.co>,
        Leonardo Di Donato <leonardo.didonato@...stic.co>
Subject: Re: [PATCH bpf-next v5 6/9] bpftool: Implement relocations recording
 for BTFGen

On Wed, Feb 2, 2022 at 5:55 PM Andrii Nakryiko
<andrii.nakryiko@...il.com> wrote:
>
> On Fri, Jan 28, 2022 at 2:33 PM Mauricio Vásquez <mauricio@...volk.io> wrote:
> >
> > This commit implements the logic to record the relocation information
> > for the different kinds of relocations.
> >
> > btfgen_record_field_relo() uses the target specification to save all the
> > types that are involved in a field-based CO-RE relocation. In this case
> > types are resolved and added recursively (using btfgen_put_type()).
> > Only the struct and union members (and their types) involved in the
> > relocation are added to optimize the size of the generated BTF file.
> >
> > On the other hand, btfgen_record_type_relo() saves the types involved in
> > a type-based CO-RE relocation. In this case all the members for the
> > struct and union types are added. This is not strictly required since
> > libbpf doesn't use them while performing this kind of relocation,
> > however that logic could change in the future. Additionally, we expect
> > the number of relocations of this kind in a BPF object to be very
> > low, hence the impact on the size of the generated BTF should be
> > negligible.
> >
> > Finally, btfgen_record_enumval_relo() saves the whole enum type for
> > enum-based relocations.
> >
> > Signed-off-by: Mauricio Vásquez <mauricio@...volk.io>
> > Signed-off-by: Rafael David Tinoco <rafael.tinoco@...asec.com>
> > Signed-off-by: Lorenzo Fontana <lorenzo.fontana@...stic.co>
> > Signed-off-by: Leonardo Di Donato <leonardo.didonato@...stic.co>
> > ---
>
> I've been thinking about this in background. This proliferation of
> hashmaps to store used types and their members really adds to
> complexity (and no doubt to memory usage and CPU utilization, even
> though I don't think either is too big for this use case).
>
> What if instead of keeping track of used types and members separately,
> we initialize the original struct btf and its btf_type, btf_member,
> btf_enum, etc types. We can carve out one bit in them to mark whether
> that specific entity was used. That way you don't need any extra
> hashmap maintenance. You just set or check bit on each type or its
> member to figure out if it has to be in the resulting BTF.
>
> This can be the highest bit of the name_off or type fields, depending on
> the specific case. This will work well because type IDs never use the
> highest bit and string offsets can never be high enough to need the full
> 32 bits.
>
> You'll probably want to have two copies of target BTF for this, of
> course, but I think simplicity of bookkeeping trumps this
> inefficiency. WDYT?
>
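
For reference, the bit-marking idea quoted above could look roughly like
the sketch below. It is only an illustration: struct sketch_bit_member is
a simplified stand-in that mirrors the three 32-bit fields of struct
btf_member, and SKETCH_USED_BIT and the helper names are made up here,
they are not part of libbpf or bpftool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct btf_member (hypothetical, illustration only). */
struct sketch_bit_member {
	uint32_t name_off; /* offset into the BTF string section */
	uint32_t type;     /* type ID of the member */
	uint32_t offset;   /* member offset */
};

/* Hypothetical marker: the highest bit of name_off means "used". */
#define SKETCH_USED_BIT (UINT32_C(1) << 31)

/* Flag a member as used; marking twice is harmless. */
static void sketch_member_mark_used(struct sketch_bit_member *m)
{
	assert(m != NULL);
	m->name_off |= SKETCH_USED_BIT;
}

/* Check whether a member was flagged as used. */
static bool sketch_member_is_used(const struct sketch_bit_member *m)
{
	assert(m != NULL);
	return (m->name_off & SKETCH_USED_BIT) != 0;
}

/* Recover the real string offset by masking the marker bit out. */
static uint32_t sketch_member_name_off(const struct sketch_bit_member *m)
{
	assert(m != NULL);
	return m->name_off & ~SKETCH_USED_BIT;
}
```

This relies on real string offsets staying below 2^31, which is the
assumption stated in the quoted text.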

It's a very nice idea indeed. I got a version working with this idea.
I keep two instances of the target BTF (as you suggested): one is only
for keeping track of the used types/members, and the other is used as
the source when copying the BTF types and also to run the candidate
search algorithm and so on. Actually there is no need to use the highest
bit; I'm just setting the whole name_off to UINT32_MAX. It works fine
because that copy of the BTF isn't used anywhere else. I'm cleaning
this up and hope to send it early next week.
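
A minimal sketch of the two-copy bookkeeping, not the actual patch: the
struct is a simplified stand-in for struct btf_member, and the names
sketch_mark_used() and sketch_copy_used() are hypothetical. The marker
copy gets name_off overwritten with UINT32_MAX, while the untouched
source copy provides the data that ends up in the output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct btf_member (hypothetical, illustration only). */
struct sketch_member {
	uint32_t name_off;
	uint32_t type;
	uint32_t offset;
};

/* Mark member i as used. Only the marker copy is modified. */
static void sketch_mark_used(struct sketch_member *marks, size_t i)
{
	assert(marks != NULL);
	marks[i].name_off = UINT32_MAX;
}

/*
 * Copy every member flagged in marks from the pristine src array into out.
 * Returns the number of members copied; out must have room for n entries.
 */
static size_t sketch_copy_used(const struct sketch_member *src,
			       const struct sketch_member *marks,
			       size_t n, struct sketch_member *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (marks[i].name_off == UINT32_MAX)
			out[kept++] = src[i];
	}
	return kept;
}
```

Since the whole field is overwritten, the original name_off can only be
read back from the source copy, which is why the second instance is
needed.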

Thanks for all the feedback!