[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201103175520.spqvqhohtnietnlt@ast-mbp.dhcp.thefacebook.com>
Date: Tue, 3 Nov 2020 09:55:20 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: Andrii Nakryiko <andrii@...nel.org>, bpf <bpf@...r.kernel.org>,
Networking <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
On Mon, Nov 02, 2020 at 10:27:20PM -0800, Andrii Nakryiko wrote:
> On Mon, Nov 2, 2020 at 9:10 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > On Wed, Oct 28, 2020 at 05:58:59PM -0700, Andrii Nakryiko wrote:
> > > @@ -2942,6 +2948,13 @@ struct btf_dedup {
> > > __u32 *hypot_list;
> > > size_t hypot_cnt;
> > > size_t hypot_cap;
> > > + /* Whether hypothethical mapping, if successful, would need to adjust
> > > + * already canonicalized types (due to a new forward declaration to
> > > + * concrete type resolution). In such case, during split BTF dedup
> > > + * candidate type would still be considered as different, because base
> > > + * BTF is considered to be immutable.
> > > + */
> > > + bool hypot_adjust_canon;
> >
> > why one flag per dedup session is enough?
>
> So the entire hypot_xxx state is reset before each struct/union type
> graph equivalence check. Then for each struct/union type we might do
> potentially many type graph equivalence checks against each of
> potential canonical (already deduplicated) struct. Let's keep that in
> mind for the answer below.
>
> > Don't you have a case where some fwd are pointing to base btf and shouldn't
> > be adjusted while some are in split btf and should be?
> > It seems when this flag is set to true it will miss fwd in split btf?
>
> So keeping the above note in mind, let's think about this case. You
> are saying that some FWDs would have candidates in base BTF, right?
> That means that the canonical type we are checking equivalence against
> has to be in the base BTF. That also means that all the canonical type
> graph types are in the base BTF, right? Because no base BTF type can
> reference types from split BTF. This, subsequently, means that no FWDs
> from split BTF graph could have canonical matching types in split BTF,
> because we are comparing split types against only base BTF types.
>
> With that, if hypot_adjust_canon is triggered, *entire graph*
> shouldn't be matched. No single type in that (connected) graph should
> be matched to base BTF. We essentially pretend that canonical type
> doesn't even exist for us (modulo the subtle bit of still recording
> base BTF's FWD mapping to a concrete type in split BTF for FWD-to-FWD
> resolution at the very end, we can ignore that here, though, it's an
> ephemeral bookkeeping discarded after dedup).
>
> In your example you worry about resolving FWD in split BTF to concrete
> type in split BTF. If that's possible (i.e., we have duplicates and
> enough information to infer the FWD-to-STRUCT mapping), then we'll
> have another canonical type to compare against, at which point we'll
> establish FWD-to-STRUCT mapping, like usual, and hypot_adjust_canon
> will stay false (because we'll be staying with split BTF types only).
>
> But honestly, with graphs it can get so complicated that I wouldn't be
> surprised if I'm still missing something. So far, manually checking
> the resulting BTF showed that generated deduped BTF types look
> correct. Few cases where module BTFs had duplicated types from vmlinux
> I was able to easily find where exactly vmlinux had FWD while modules
> had STRUCT/UNION.
>
> But also, by being conservative with hypot_adjust_canon, the worst
> case would be slight duplication of types, which is not the end of the
> world. Everything will keep working, no data will be corrupted, libbpf
> will still perform CO-RE relocation correctly (because memory layout
> of duplicated structs will be consistent across all copies, just like
> it was with task_struct until ring_buffers were renamed).
Yes. That last part is comforting. The explanation also makes sense.
Not worried about it anymore.
Powered by blists - more mailing lists