netdev - Re: [PATCH bpf-next 3/3] bpf: convert explored

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzZkWWCqEJ8mKJjqkF1FpvP+urJ5dcdhneCoPd4wtViOww@mail.gmail.com>
Date:   Tue, 21 May 2019 17:55:06 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Alexei Starovoitov <ast@...nel.org>
Cc:     davem@...emloft.net, Daniel Borkmann <daniel@...earbox.net>,
        Networking <netdev@...r.kernel.org>, bpf@...r.kernel.org,
        Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next 3/3] bpf: convert explored_states to hash table

On Tue, May 21, 2019 at 4:08 PM Alexei Starovoitov <ast@...nel.org> wrote:
>
> All prune points inside a callee bpf function most likely will have
> different callsites. For example, if function foo() is called from
> two callsites the half of explored states in all prune points in foo()
> will be useless for subsequent walking of one of those callsites.
> Fortunately explored_states pruning heuristics keeps the number of states
> per prune point small, but walking these states is still a waste of cpu
> time when the callsite of the current state is different from the callsite
> of the explored state.
>
> To improve pruning logic convert explored_states into hash table and
> use simple insn_idx ^ callsite hash to select hash bucket.
> This optimization has no effect on programs without bpf2bpf calls
> and drastically improves programs with calls.
> In the later case it reduces total memory consumption in 1M scale tests
> by almost 3 times (peak_states drops from 5752 to 2016).
>
> Care should be taken when comparing the states for equivalency.
> Since the same hash bucket can now contain states with different indices
> the insn_idx has to be part of verifier_state and compared.
>
> Different hash table sizes and different hash functions were explored,
> but the results were not significantly better vs this patch.
> They can be improved in the future.
>
> Hit/miss heuristic is not counting index miscompare as a miss.
> Otherwise verifier stats become unstable when experimenting
> with different hash functions.
>
> Signed-off-by: Alexei Starovoitov <ast@...nel.org>
> ---
>  include/linux/bpf_verifier.h |  1 +
>  kernel/bpf/verifier.c        | 23 ++++++++++++++++++-----
>  2 files changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 02bba09a0ea1..405b502283c5 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -187,6 +187,7 @@ struct bpf_func_state {
>  struct bpf_verifier_state {
>         /* call stack tracking */
>         struct bpf_func_state *frame[MAX_CALL_FRAMES];
> +       u32 insn_idx;
>         u32 curframe;
>         u32 active_spin_lock;
>         bool speculative;
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 89097a4b1bf3..082f6eefb1c4 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5435,11 +5435,19 @@ enum {
>         BRANCH = 2,
>  };
>
> +static u32 state_htab_size(struct bpf_verifier_env *env)

maybe mark it as inline function? it's called pretty heavily.

> +{
> +       return env->prog->len;
> +}
> +
>  static struct bpf_verifier_state_list **explored_state(
>                                         struct bpf_verifier_env *env,
>                                         int idx)
>  {
> -       return &env->explored_states[idx];
> +       struct bpf_verifier_state *cur = env->cur_state;
> +       struct bpf_func_state *state = cur->frame[cur->curframe];
> +
> +       return &env->explored_states[(idx ^ state->callsite) % state_htab_size(env)];

% is slow, see [1] for faster alternative.

Alternatively, if you can make sure that hash size is power of two,
then multiplicative Fibonacci hashing is preferred ([2]).

[1] https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
[2] https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/

>  }
>
>  static void init_explored_state(struct bpf_verifier_env *env, int idx)
> @@ -6018,7 +6026,8 @@ static void clean_live_states(struct bpf_verifier_env *env, int insn,
>
>         sl = *explored_state(env, insn);
>         while (sl) {
> -               if (sl->state.curframe != cur->curframe)
> +               if (sl->state.insn_idx != insn ||
> +                   sl->state.curframe != cur->curframe)
>                         goto next;
>                 for (i = 0; i <= cur->curframe; i++)
>                         if (sl->state.frame[i]->callsite != cur->frame[i]->callsite)
> @@ -6384,6 +6393,9 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>         clean_live_states(env, insn_idx, cur);
>
>         while (sl) {
> +               states_cnt++;
> +               if (sl->state.insn_idx != insn_idx)
> +                       goto next;

Shouldn't this be checked inside states_equal? Or you are trying to
avoid function call overhead? If the latter is the case, then you
should probably compare curframe as well here?

>                 if (states_equal(env, &sl->state, cur)) {
>                         sl->hit_cnt++;
>                         /* reached equivalent register/stack state,
> @@ -6401,7 +6413,6 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>                                 return err;
>                         return 1;
>                 }
> -               states_cnt++;
>                 sl->miss_cnt++;
>                 /* heuristic to determine whether this state is beneficial
>                  * to keep checking from state equivalence point of view.
> @@ -6428,6 +6439,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>                         sl = *pprev;
>                         continue;
>                 }
> +next:
>                 pprev = &sl->next;
>                 sl = *pprev;
>         }
> @@ -6459,6 +6471,7 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>                 kfree(new_sl);
>                 return err;
>         }
> +       new->insn_idx = insn_idx;
>         new_sl->next = *explored_state(env, insn_idx);
>         *explored_state(env, insn_idx) = new_sl;
>         /* connect new state to parentage chain. Current frame needs all
> @@ -8138,7 +8151,7 @@ static void free_states(struct bpf_verifier_env *env)
>         if (!env->explored_states)
>                 return;
>
> -       for (i = 0; i < env->prog->len; i++) {
> +       for (i = 0; i < state_htab_size(env); i++) {
>                 sl = env->explored_states[i];
>
>                 while (sl) {
> @@ -8246,7 +8259,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
>                         goto skip_full_check;
>         }
>
> -       env->explored_states = kvcalloc(env->prog->len,
> +       env->explored_states = kvcalloc(state_htab_size(env),
>                                        sizeof(struct bpf_verifier_state_list *),
>                                        GFP_USER);
>         ret = -ENOMEM;
> --
> 2.20.0
>