linux-kernel - Re: [PATCH bpf-next v2 02/28] bpf: introduce hid program type

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAO-hwJJqP5iivQZOu0LTYa1D5OuM_aVi=LH27Udc_VYkbFsrww@mail.gmail.com>
Date:   Fri, 11 Mar 2022 18:16:42 +0100
From:   Benjamin Tissoires <benjamin.tissoires@...hat.com>
To:     Song Liu <song@...nel.org>
Cc:     Greg KH <gregkh@...uxfoundation.org>,
        Jiri Kosina <jikos@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>, Shuah Khan <shuah@...nel.org>,
        Dave Marchevsky <davemarchevsky@...com>,
        Joe Stringer <joe@...ium.io>,
        Tero Kristo <tero.kristo@...ux.intel.com>,
        open list <linux-kernel@...r.kernel.org>,
        "open list:HID CORE LAYER" <linux-input@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        linux-kselftest@...r.kernel.org
Subject: Re: [PATCH bpf-next v2 02/28] bpf: introduce hid program type

On Tue, Mar 8, 2022 at 10:20 AM Benjamin Tissoires
<benjamin.tissoires@...hat.com> wrote:
>
> On Tue, Mar 8, 2022 at 1:57 AM Song Liu <song@...nel.org> wrote:
> >
> > On Mon, Mar 7, 2022 at 10:39 AM Benjamin Tissoires
> > <benjamin.tissoires@...hat.com> wrote:
> > >
> > > On Sat, Mar 5, 2022 at 1:03 AM Song Liu <song@...nel.org> wrote:
> > > >
> > > > On Fri, Mar 4, 2022 at 9:31 AM Benjamin Tissoires
> > > > <benjamin.tissoires@...hat.com> wrote:
> > > > >
[...]
> > > > > +struct hid_bpf_ctx {
> > > > > +       enum hid_bpf_event type;        /* read-only */
> > > > > +       __u16 allocated_size;           /* the allocated size of data below (RO) */
> > > >
> > > > There is a (6-byte?) hole here.
> > > >
> > > > > +       struct hid_device *hdev;        /* read-only */
> > > > > +
> > > > > +       __u16 size;                     /* used size in data (RW) */
> > > > > +       __u8 data[];                    /* data buffer (RW) */
> > > > > +};
> > > >
> > > > Do we really need hit_bpf_ctx in uapi? Maybe we can just use it
> > > > from vmlinuxh?
> > >
> > > I had a thought at this context today, and I think I am getting to the
> > > limit of what I understand.
> > >
> > > My first worry is that the way I wrote it there, with a variable data
> > > field length is that this is not forward compatible. Unless BTF and
> > > CORE are making magic, this will bite me in the long run IMO.
> > >
> > > But then, you are talking about not using uapi, and I am starting to
> > > wonder: am I doing the things correctly?
> > >
> > > To solve my first issue (and the weird API I had to introduce in the
> > > bpf_hid_get/set_data), I came up to the following:
> > > instead of exporting the data directly in the context, I could create
> > > a helper bpf_hid_get_data_buf(ctx, const uint size) that returns a
> > > RET_PTR_TO_ALLOC_MEM_OR_NULL in the same way bpf_ringbuf_reserve()
> > > does.
> > >
> > > This way, I can directly access the fields within the bpf program
> > > without having to worry about the size.
> > >
> > > But now, I am wondering whether the uapi I defined here is correct in
> > > the way CORE works.
> > >
> > > My goal is to have HID-BPF programs to be CORE compatible, and not
> > > have to recompile them depending on the underlying kernel.
> > >
> > > I can not understand right now if I need to add some other BTF helpers
> > > in the same way the access to struct xdp_md and struct xdp_buff are
> > > converted between one and other, or if defining a forward compatible
> > > struct hid_bpf_ctx is enough.
> > > As far as I understand, .convert_ctx_access allows to export a stable
> > > uapi to the bpf prog users with the verifier doing the conversion
> > > between the structs for me. But is this really required for all the
> > > BPF programs if we want them to be CORE?
> > >
> > > Also, I am starting to wonder if I should not hide fields in the
> > > context to the users. The .data field could be a pointer and only
> > > accessed through the helper I mentioned above. This would be forward
> > > compatible, and also allows to use whatever available memory in the
> > > kernel to be forwarded to the BPF program. This way I can skip the
> > > memcpy part and work directly with the incoming dma data buffer from
> > > the IRQ.
> > >
> > > But is it best practice to do such a thing?
> >
> > I think .convert_ctx_access is the way to go if we want to access the data
> > buffer without memcpy. I am not sure how much work is needed to make
> > it compatible with CORE though.

I spent the week working on that, and I am really amazed at how smart
and simple the .convert_ctx_access is working.

With that in place, I can hide the internal fields and export
"virtual" public fields that are remapped on the fly by the kernel at
load time :)

> >
> > To make sure I understand the case, do we want something like
> >
> > bpf_prog(struct hid_bpf_ctx *ctx)
> > {
> >     /* makes sure n < ctx->size */
> >     x = ctx->data[n]; /* read data */
> >     ctx->data[n] = <something>; /* write data */
> >     ctx->size = <something <= n>; /* change data size */
> > }
> >
> > We also need it to be CORE, so that we may modify hid_bpf_ctx by
> > inserting more members to it before data.
> >
> > Is this accurate?
> >

I have been trying to implement that, based on the PTR_TO_PACKET implementation.
It works, but is kind of verbose while using it because we need to
teach the verifier about the size of the arrays.

For instance, I have the following:

int bpf_prog(struct hid_bpf_ctx *ctx)
{
  /* we need to store a pointer on the stack to store its accessible size */
  __u8 *data = ctx->data;

  if (data + 3 > ctx->data_end)
    return 0; /* EPERM, bounds check */

  data[1] = data[0] + 3;

  return 0;
}

This is OK, but it gets worse if I want to access a random offset:

__s64 offset = 0;
int bpf_prog(struct hid_bpf_ctx *ctx)
{
  __u8 *data = ctx->data;
  __u16 *x;

  /* first assign the size of data */
  if (data + 4 > ctx->data_end)
    return 0; /* EPERM, bounds check */

  /* teach the verifier the range of offset (needs to be a s64) */
  if (offset >= 0 && offset < 2) {
    x = (__u16 *)&data[offset];

    /* now store the size of x in its register */
    if (x + 2 > ctx->data_end)
      return 0;

    /* finally we can read/write the data */
    *x += 1;
  }

  return 0;
}

OTOH, I managed to define a simpler helper that allows me to return a
pointer to the internal data:

BPF_CALL_3(bpf_hid_get_data, struct hid_bpf_ctx_kern*, ctx, u64,
offset, u64, size)
  {
          if (!size)
                  return (unsigned long)0;

          if (offset + size > ctx->data_end - ctx->data)
                  return (unsigned long)0;

          return (unsigned long)(ctx->data + offset);
  }

  static const struct bpf_func_proto bpf_hid_get_data_proto = {
          .func      = bpf_hid_get_data,
          .gpl_only  = true,
          .ret_type  = RET_PTR_TO_ALLOC_MEM_OR_NULL,
          .arg1_type = ARG_PTR_TO_CTX,
          .arg2_type = ARG_ANYTHING,
          .arg3_type = ARG_CONST_ALLOC_SIZE_OR_ZERO,
  };

 Which makes the previous bpf code into:

__u64 offset = 0;
int bpf_prog(struct hid_bpf_ctx *ctx)
{
  __u16 *x = bpf_hid_get_data(ctx, offset, 2);

  if (!x)
    return 0; /* EPERM, bounds check */

  *x += 1;

  return 0;
}

The advantage of both of those solutions is that they are removing the
need for bpf_hid_{get|set}_bytes().

The second solution is much simpler in terms of usage, but it doesn't
feel "right" to have to reimplement the wheel when we should have
direct array accesses.
OTOH, the first solution with the packet implementation is bulky and
requiring users to do that for every single access of the data is IMO
way too much...

Any thoughts?

Cheers,
Benjamin