Message-ID: <5eec061598dcf_403f2afa5de805bcde@john-XPS-13-9370.notmuch>
Date: Thu, 18 Jun 2020 17:25:57 -0700
From: John Fastabend <john.fastabend@...il.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>,
John Fastabend <john.fastabend@...il.com>
Cc: Jiri Olsa <jolsa@...hat.com>, Andrii Nakryiko <andriin@...com>,
Jiri Olsa <jolsa@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
Yonghong Song <yhs@...com>, Martin KaFai Lau <kafai@...com>,
Jakub Kicinski <kuba@...nel.org>,
David Miller <davem@...hat.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
KP Singh <kpsingh@...omium.org>,
Masanori Misono <m.misono760@...il.com>
Subject: Re: [PATCH] bpf: Allow small structs to be type of function argument
Andrii Nakryiko wrote:
> On Thu, Jun 18, 2020 at 3:50 PM John Fastabend <john.fastabend@...il.com> wrote:
> >
> > Jiri Olsa wrote:
> > > On Wed, Jun 17, 2020 at 04:20:54PM -0700, John Fastabend wrote:
> > > > Jiri Olsa wrote:
> > > > > This way we can have trampoline on function
> > > > > that has arguments with types like:
> > > > >
> > > > > kuid_t uid
> > > > > kgid_t gid
> > > > >
> > > > > which unwind into small structs like:
> > > > >
> > > > > >   typedef struct {
> > > > > >           uid_t val;
> > > > > >   } kuid_t;
> > > > > >
> > > > > >   typedef struct {
> > > > > >           gid_t val;
> > > > > >   } kgid_t;
> > > > >
> > > > > And we can use them in bpftrace like:
> > > > > (assuming d_path changes are in)
> > > > >
> > > > > # bpftrace -e 'lsm:path_chown { printf("uid %d, gid %d\n", args->uid, args->gid) }'
> > > > > Attaching 1 probe...
> > > > > uid 0, gid 0
> > > > > uid 1000, gid 1000
> > > > > ...
> > > > >
> > > > > Signed-off-by: Jiri Olsa <jolsa@...nel.org>
> > > > > ---
> > > > > kernel/bpf/btf.c | 12 +++++++++++-
> > > > > 1 file changed, 11 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > > > index 58c9af1d4808..f8fee5833684 100644
> > > > > --- a/kernel/bpf/btf.c
> > > > > +++ b/kernel/bpf/btf.c
> > > > > @@ -362,6 +362,14 @@ static bool btf_type_is_struct(const struct btf_type *t)
> > > > > return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;
> > > > > }
> > > > >
> > > > > +/* type is struct and its size is within 8 bytes
> > > > > + * and it can be value of function argument
> > > > > + */
> > > > > +static bool btf_type_is_struct_arg(const struct btf_type *t)
> > > > > +{
> > > > > + return btf_type_is_struct(t) && (t->size <= sizeof(u64));
> > > >
> > > > > Can you comment on why sizeof(u64) here? The int types can be larger
> > > > > than 64 bits, for example, and don't have a similar check; maybe they
> > > > > should as well?
> > > >
> > > > Here is an example from some made up program I ran through clang and
> > > > bpftool.
> > > >
> > > > [2] INT '__int128' size=16 bits_offset=0 nr_bits=128 encoding=SIGNED
> > > >
> > > > We also have btf_type_int_is_regular to decide if the int is of some
> > > > "regular" size but I don't see it used in these paths.
> > >
> > > > so these small structs are passed as scalars via function arguments,
> > > > so the size limit is there to fit their value into the register that
> > > > holds the argument
> > >
> > > I'm not sure how 128bit numbers are passed to function as argument,
> > > but I think we can treat them separately if there's a need
> > >
> >
> > Moving Andrii up to the TO field ;)
>
> I've got an upgrade, thanks :)
>
> >
> > Andrii, do we also need a guard on the int type with sizeof(u64)?
> > Otherwise the arg calculation might be incorrect? wdyt, did I follow
> > along correctly?
>
> Yes, we probably do. I actually never used __int128 in practice, but
> decided to look at what Clang does for a function accepting __int128.
> Turns out it passed it in two consecutive registers. So:
>
> __weak int bla(__int128 x) { return (int)(x + 1); }
>
> The assembly is:
>
> 38: b7 01 00 00 fe ff ff ff r1 = -2
> 39: b7 02 00 00 ff ff ff ff r2 = -1
> 40: 85 10 00 00 ff ff ff ff call -1
> 41: bc 01 00 00 00 00 00 00 w1 = w0
>
> So low 64-bits go into r1, high 64-bits into r2.
>
> Which means the 1:1 mapping between registers and input arguments
> breaks with __int128, at least for target BPF. I'm too lazy to check
> for x86-64, though.

OK, that confirms what I suspected. For a fix we should bound int types
here to the pointer word size, which I think should be safe almost
everywhere. I can draft a patch if you haven't done one already.

For what it's worth, RISC-V had some convention where it would use the
even registers for things. So

  foo(int a, __int128 b)

would put a in r0 and b in r2 and r3, leaving a hole in r1. But that
was from some old reference manual and might no longer be the case
in reality. Perhaps I'm just spreading hearsay, but the point is we
should say something about what the BPF backend convention is and
write it down. We've started to bump into these things lately.
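
The guard discussed above can be sketched as a small standalone C
fragment. This is only an illustration under stated assumptions, not the
kernel patch: struct arg_type, enum arg_kind, arg_fits_one_reg() and
check_args() are hypothetical names standing in for the BTF type info and
the checks in kernel/bpf/btf.c, and the limit is taken as sizeof(uint64_t)
to mirror the sizeof(u64) in the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for BTF type info; NOT the kernel's struct btf_type. */
enum arg_kind { ARG_INT, ARG_PTR, ARG_STRUCT };

struct arg_type {
	enum arg_kind kind;
	size_t size;	/* size of the type in bytes */
};

/*
 * True if the argument occupies exactly one 64-bit register.
 * Assumption: pointers always fit; ints and structs fit only
 * when they are non-empty and no larger than 8 bytes.
 */
static bool arg_fits_one_reg(const struct arg_type *t)
{
	if (t->kind == ARG_PTR)
		return true;
	return t->size > 0 && t->size <= sizeof(uint64_t);
}

/*
 * Returns nargs when every argument maps 1:1 onto a register,
 * or -1 as soon as one argument would need more than one
 * register (for example a 16-byte __int128).
 */
static int check_args(const struct arg_type *args, int nargs)
{
	assert(args != NULL || nargs == 0);

	for (int i = 0; i < nargs; i++) {
		if (!arg_fits_one_reg(&args[i]))
			return -1;
	}
	return nargs;
}
```

With this model, an argument list such as (int, kuid_t, pointer) is
accepted, while a list containing a 16-byte __int128 is rejected because
that argument would take two registers and break the 1:1 mapping.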