netdev - Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180210031808.dygaa4w6ehuvuxr5@ast-mbp>
Date:   Fri, 9 Feb 2018 19:18:09 -0800
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Edward Cree <ecree@...arflare.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org
Subject: Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking

On Thu, Feb 08, 2018 at 07:31:55PM +0000, Edward Cree wrote:
> By storing subprog boundaries as a subprogno mark on each insn, rather than
>  a start (and implicit end) for each subprog, we collect a number of gains:
> * More efficient determination of which subprog contains a given insn, and
>   thus of find_subprog (which subprog begins at a given insn).
> * Number of verifier passes is reduced, since most of the work is done in
>   the main insn walk (do_check()).

unfortunately this is the opposite of where the verifier should be heading.
For bpf libraries and indirect call support the verifier has to be further
split into more distinct passes. Not less. Since it needs to determine
function boundaries and general validity of the instructions without
performing do_check()-style walk.
The function in the given prog may be called by other bpf functions that will be
loaded later and just like indirect calls the final 'check all pointers
and load/stores' pass (currently done by do_check) will done last.
(In case of indirect calls this pass will done at the time of
bpf_map_update() call)
while all preparatory passes like function boundaries, basic block
detection, control flow check and general instruction sanity will
be done before that final run-time linking phase.
Hence we cannot merge subprog detection pass with anything else.
It has to be done first.

> * Subprogs no longer have to be contiguous; so long as they don't overlap
>   and there are no unreachable insns, verifier is happy.  (This does require
>   a small amount of care at jit_subprogs() time to fix up jump offsets, so
>   we could instead disallow this if people prefer.)

do you have patches to make llvm blend multiple C functions into
non-contiguous piece of .text code ? What would be the purpose of
such bizarre code generation?
If not, there is no reason to support such code layout in the verifier.

> Some other changes were also included to support this:
> * Per-subprog info is stored in env->subprog_info, an array of structs,
>   rather than several arrays with a common index.
> * Call graph is now stored in the new bpf_subprog_info struct; used here for
>   check_max_stack_depth() but may have other uses too.

The new maximum stack depth algorithm is interesting, but it looks
unrelated to the rest and much harder to understand
comparing to existing check_max_stack_depth() algorithm.
The patches must explain the benefits clearly. Currently it's not the case.

> * LD_ABS and LD_IND were previously disallowed in programs that also contain
>   subprog calls.  Now they are only disallowed in callees, i.e. main() can
>   always use them even if it also uses subprog calls.  AFAICT this is safe
>   (main()'s r1 arg is still known to be ctx, so prologue can do its stuff).
>   But again it can be disallowed if necessary.

What is the purpose of allowing ld_abs in calls ?
The work over the last years have been to avoid ld_abs as much as possible,
since these instructions are slow, carry a lot of legacy baggage and corner cases
that cause JITs and verifier to do additional work and make every piece more complex.
Allowing ld_abs with calls is going into the opposite direction.
Personally I very much dislike patches that "lets do this just because".
Nothing in the patch 1 or in this cover letter explain the reason.

Overall I think the set is more harmful than useful.

If you like to help developing bpf infrastructure please focus
on replacing signed/unsigned min/max tracking logic with simpler and safer logic.
1. [su]minmax burning memory and making verifier process slower
2. it's internally inconsistent, since it can determine that smin > smax.
   We had to remove that WARN_ON, but underlying issue remains.
3. it had been source of security bugs
4. The reason it was introduced was to verify validity of variable length
   array access, but it can be done much simpler instead.
We're dealing with C compiled code here and C arrays have zero till N range.
There is no reason to track that a variable can be between -100 to -10.
This tracking is a waste of verifier resources for realistic C programs.
All of [su]minmax can be replaced with {u32_max_value, can_be_zero, can_be_negative}
tracking which is 4 bytes and 2 bits of info per bpf_reg_state instead of 32 bytes.
Such tracking will be more accurate as well.
llvm can optimize 'var >= 1' compare into 'var != 0' and we've seen
cases that this optimization breaks minmax tracking, since !=0 simply
doesn't fit into minmax logic. One can argue reg_set_min_max() needs to track
this extra boolean state as well, but then it would mean more inconsistency
in the bpf_reg_state and information contradicting each other.
Any verifier pass looking at bpf_reg_state at any given time should
see bpf_reg_state being valid and not guessing why smin is suddenly more
than smax.