netdev - Re: [PATCH bpf-next 1/2] bpf: track contents of read-only maps as scalars

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzaLHY9MHbp27VxvVZcKWvbO43F2n6frKi_8kgqCXMDKMg@mail.gmail.com>
Date:   Tue, 8 Oct 2019 16:49:30 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Martin Lau <kafai@...com>
Cc:     "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Alexei Starovoitov <ast@...com>,
        "daniel@...earbox.net" <daniel@...earbox.net>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH bpf-next 1/2] bpf: track contents of read-only maps as scalars

On Tue, Oct 8, 2019 at 2:53 PM Martin Lau <kafai@...com> wrote:
>
> On Tue, Oct 08, 2019 at 12:45:47PM -0700, Andrii Nakryiko wrote:
> > Maps that are read-only both from BPF program side and user space side
> > have their contents constant, so verifier can track referenced values
> > precisely and use that knowledge for dead code elimination, branch
> > pruning, etc. This patch teaches BPF verifier how to do this.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@...com>
> > ---
> >  kernel/bpf/verifier.c | 58 +++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 56 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index ffc3e53f5300..1e4e4bd64ca5 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -2739,6 +2739,42 @@ static void coerce_reg_to_size(struct bpf_reg_state *reg, int size)
> >       reg->smax_value = reg->umax_value;
> >  }
> >
> > +static bool bpf_map_is_rdonly(const struct bpf_map *map)
> > +{
> > +     return (map->map_flags & BPF_F_RDONLY_PROG) &&
> > +            ((map->map_flags & BPF_F_RDONLY) || map->frozen);
> > +}
> > +
> > +static int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val)
> > +{
> > +     void *ptr;
> > +     u64 addr;
> > +     int err;
> > +
> > +     err = map->ops->map_direct_value_addr(map, &addr, off + size);
> Should it be "off" instead of "off + size"?

>From array_map_direct_value_addr() code, offset is used only to check
that access is happening within array value bounds. It's not used to
calculate returned pointer.
But now re-reading its code again, I think this check is wrong:

if (off >= map->value_size)
        break;

It has to be (off > map->value_size). But it seems like this whole
interface is counter-intuitive.

I'm wondering if Daniel can clarify the intent behind this particular behavior.

For now the easiest fix is to pass (off + size - 1). But maybe we
should change the contract to be something like

int map_direct_value_addr(const struct bpf_map *map, u64 off, int
size, void *ptr)

This then can validate that entire access in the range of [off, off +
size) is acceptable to a map, and then return void * pointer according
to given off. Thoughts?

>
> > +     if (err)
> > +             return err;
> > +     ptr = (void *)addr + off;
> > +
> > +     switch (size) {
> > +     case sizeof(u8):
> > +             *val = (u64)*(u8 *)ptr;
> > +             break;
> > +     case sizeof(u16):
> > +             *val = (u64)*(u16 *)ptr;
> > +             break;
> > +     case sizeof(u32):
> > +             *val = (u64)*(u32 *)ptr;
> > +             break;
> > +     case sizeof(u64):
> > +             *val = *(u64 *)ptr;
> > +             break;
> > +     default:
> > +             return -EINVAL;
> > +     }
> > +     return 0;
> > +}
> > +
> >  /* check whether memory at (regno + off) is accessible for t = (read | write)
> >   * if t==write, value_regno is a register which value is stored into memory
> >   * if t==read, value_regno is a register which will receive the value from memory
> > @@ -2776,9 +2812,27 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
> >               if (err)
> >                       return err;
> >               err = check_map_access(env, regno, off, size, false);
> > -             if (!err && t == BPF_READ && value_regno >= 0)
> > -                     mark_reg_unknown(env, regs, value_regno);
> > +             if (!err && t == BPF_READ && value_regno >= 0) {
> > +                     struct bpf_map *map = reg->map_ptr;
> > +
> > +                     /* if map is read-only, track its contents as scalars */
> > +                     if (tnum_is_const(reg->var_off) &&
> > +                         bpf_map_is_rdonly(map) &&
> > +                         map->ops->map_direct_value_addr) {
> > +                             int map_off = off + reg->var_off.value;
> > +                             u64 val = 0;
> >
> > +                             err = bpf_map_direct_read(map, map_off, size,
> > +                                                       &val);
> > +                             if (err)
> > +                                     return err;
> > +
> > +                             regs[value_regno].type = SCALAR_VALUE;
> > +                             __mark_reg_known(&regs[value_regno], val);
> > +                     } else {
> > +                             mark_reg_unknown(env, regs, value_regno);
> > +                     }
> > +             }
> >       } else if (reg->type == PTR_TO_CTX) {
> >               enum bpf_reg_type reg_type = SCALAR_VALUE;
> >
> > --
> > 2.17.1
> >