Message-ID: <CAEf4Bzbq=ysuE90OkpCXxkm-7_MewANteSQQj_HYuTkVbwNhhA@mail.gmail.com>
Date:   Wed, 2 Jun 2021 15:38:18 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next 2/3] bpf: Add verifier checks for bpf_timer.

On Wed, Jun 2, 2021 at 3:34 PM Andrii Nakryiko
<andrii.nakryiko@...il.com> wrote:
>
> On Wed, May 26, 2021 at 9:03 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > From: Alexei Starovoitov <ast@...nel.org>
> >
> > Add appropriate safety checks for bpf_timer:
> > - restrict to array, hash, lru. per-cpu maps cannot be supported.
> > - kfree bpf_timer during map_delete_elem and map_free.
> > - verifier btf checks.
> > - safe interaction with lookup/update/delete operations and iterator.
> > - relax the first field only requirement of the previous patch.
> > - allow bpf_timer in global data and search for it in datasec.

I'll mention it here for completeness. I don't think supporting timer
or spinlock in memory-mapped maps is worth the safety implications.
It's way too easy to abuse (or even to accidentally corrupt kernel
state). Sure, it's nice, but using an explicit single-element map for
a "global" timer is just fine. And it generalizes nicely to having
2, 3, ..., N timers.

> > - check prog_rdonly, frozen flags.
> > - mmap is allowed. otherwise global timer is not possible.
> >
> > Signed-off-by: Alexei Starovoitov <ast@...nel.org>
> > ---
> >  include/linux/bpf.h        | 36 +++++++++++++-----
> >  include/linux/btf.h        |  1 +
> >  kernel/bpf/arraymap.c      |  7 ++++
> >  kernel/bpf/btf.c           | 77 +++++++++++++++++++++++++++++++-------
> >  kernel/bpf/hashtab.c       | 53 ++++++++++++++++++++------
> >  kernel/bpf/helpers.c       |  2 +-
> >  kernel/bpf/local_storage.c |  4 +-
> >  kernel/bpf/syscall.c       | 23 ++++++++++--
> >  kernel/bpf/verifier.c      | 30 +++++++++++++--
> >  9 files changed, 190 insertions(+), 43 deletions(-)
> >
>
> [...]
>
> >  /* copy everything but bpf_spin_lock */
> >  static inline void copy_map_value(struct bpf_map *map, void *dst, void *src)
> >  {
> > +       u32 off = 0, size = 0;
> > +
> >         if (unlikely(map_value_has_spin_lock(map))) {
> > -               u32 off = map->spin_lock_off;
> > +               off = map->spin_lock_off;
> > +               size = sizeof(struct bpf_spin_lock);
> > +       } else if (unlikely(map_value_has_timer(map))) {
> > +               off = map->timer_off;
> > +               size = sizeof(struct bpf_timer);
> > +       }
>
> So the need to handle 0, 1, or 2 gaps seems to be the only reason to
> disallow both bpf_spin_lock and bpf_timer in one map element, right?
> Isn't it worth addressing that from the very beginning to lift the
> artificial restriction? E.g., for speed, you'd do:
>
> if (likely(neither spinlock nor timer)) {
>   /* fastest path */
> } else if (only one of spinlock or timer) {
>   /* do what you do here */
> } else {
>   int off1, off2, sz1, sz2;
>
>   if (spinlock_off < timer_off) {
>     off1 = spinlock_off;
>     sz1 = spinlock_sz;
>     off2 = timer_off;
>     sz2 = timer_sz;
>   } else {
>     ... you get the idea
>   }
>
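>   /* copy [0, off1), then between the gaps, then the tail */
>   memcpy(dst, src, off1);
>   memcpy(dst + off1 + sz1, src + off1 + sz1, off2 - off1 - sz1);
>   memcpy(dst + off2 + sz2, src + off2 + sz2, total_sz - off2 - sz2);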
> }
>
> It's not that bad, right?
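
For illustration only, here is a minimal user-space sketch of the
two-gap copy outlined above. It is not the kernel implementation:
struct skip_region and copy_value_skipping() are hypothetical names,
and the sketch assumes the two regions do not overlap and lie fully
inside the value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one region to leave untouched in dst. */
struct skip_region {
	uint32_t off;	/* byte offset of the region inside the value */
	uint32_t size;	/* region length in bytes; 0 means "not present" */
};

/*
 * Copy value_size bytes from src to dst, except for the bytes covered
 * by regions a and b. Assumes the regions do not overlap and fit in
 * the value. Handles 0, 1, or 2 present regions with the same code.
 */
static void copy_value_skipping(void *dst, const void *src,
				uint32_t value_size,
				struct skip_region a, struct skip_region b)
{
	uint8_t *d = dst;
	const uint8_t *s = src;
	uint32_t pos;

	/* Park absent regions at the end so they add zero-length gaps. */
	if (!a.size)
		a.off = value_size;
	if (!b.size)
		b.off = value_size;

	/* Order the regions so that a comes first. */
	if (b.off < a.off) {
		struct skip_region tmp = a;

		a = b;
		b = tmp;
	}

	memcpy(d, s, a.off);			/* bytes before the first gap */
	pos = a.off + a.size;
	memcpy(d + pos, s + pos, b.off - pos);	/* bytes between the gaps */
	pos = b.off + b.size;
	memcpy(d + pos, s + pos, value_size - pos); /* bytes after the second gap */
}
```

With neither region present this degenerates to one full-size memcpy
followed by two zero-length ones, so a real fast path would still want
the explicit "neither" branch from the pseudocode above.
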
>
> >
> > +       if (unlikely(size)) {
> >                 memcpy(dst, src, off);
> > -               memcpy(dst + off + sizeof(struct bpf_spin_lock),
> > -                      src + off + sizeof(struct bpf_spin_lock),
> > -                      map->value_size - off - sizeof(struct bpf_spin_lock));
> > +               memcpy(dst + off + size,
> > +                      src + off + size,
> > +                      map->value_size - off - size);
> >         } else {
> >                 memcpy(dst, src, map->value_size);
> >         }
>
> [...]
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index f386f85aee5c..0a828dc4968e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -3241,6 +3241,15 @@ static int check_map_access(struct bpf_verifier_env *env, u32 regno,
> >                         return -EACCES;
> >                 }
> >         }
> > +       if (map_value_has_timer(map)) {
> > +               u32 t = map->timer_off;
> > +
> > +               if (reg->smin_value + off < t + sizeof(struct bpf_timer) &&
>
> <= ? Otherwise we allow accessing the first byte, unless I'm mistaken.
>
> > +                    t < reg->umax_value + off + size) {
> > +                       verbose(env, "bpf_timer cannot be accessed directly by load/store\n");
> > +                       return -EACCES;
> > +               }
> > +       }
> >         return err;
> >  }
> >
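
To make the boundary cases of the range check above easier to reason
about, here is a small user-space model. It is a sketch, not the
verifier code: access_overlaps_field() is a hypothetical helper, lo
stands in for reg->smin_value + off, hi for reg->umax_value + off +
size, and the mixed signed/unsigned arithmetic of the real check is
ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the overlap test. The access covers the half-open byte
 * range [lo, hi) and the protected field covers [t, t + field_size).
 * Two half-open ranges intersect exactly when each one starts before
 * the other one ends.
 */
static bool access_overlaps_field(int64_t lo, int64_t hi,
				  int64_t t, int64_t field_size)
{
	return lo < t + field_size && t < hi;
}
```

In this model, with a 16-byte field at offset 16, a one-byte access at
offsets 16 or 31 is reported as overlapping, while one at offsets 15
or 32 is not. Changing the first comparison to <= would additionally
report the access at offset 32, the first byte after the field.
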
> > @@ -4675,9 +4684,24 @@ static int process_timer_func(struct bpf_verifier_env *env, int regno,
> >                         map->name);
> >                 return -EINVAL;
> >         }
> > -       if (val) {
> > -               /* todo: relax this requirement */
> > -               verbose(env, "bpf_timer field can only be first in the map value element\n");
>
> ok, this was confusing, but now I see why you did that...
>
> > +       if (!map_value_has_timer(map)) {
> > +               if (map->timer_off == -E2BIG)
> > +                       verbose(env,
> > +                               "map '%s' has more than one 'struct bpf_timer'\n",
> > +                               map->name);
> > +               else if (map->timer_off == -ENOENT)
> > +                       verbose(env,
> > +                               "map '%s' doesn't have 'struct bpf_timer'\n",
> > +                               map->name);
> > +               else
> > +                       verbose(env,
> > +                               "map '%s' is not a struct type or bpf_timer is mangled\n",
> > +                               map->name);
> > +               return -EINVAL;
> > +       }
> > +       if (map->timer_off != val + reg->off) {
> > +               verbose(env, "off %lld doesn't point to 'struct bpf_timer' that is at %d\n",
> > +                       val + reg->off, map->timer_off);
> >                 return -EINVAL;
> >         }
> >         WARN_ON(meta->map_ptr);
> > --
> > 2.30.2
> >