Date:   Mon, 12 Feb 2018 21:09:04 -0800
From:   John Fastabend <john.fastabend@...il.com>
To:     Edward Cree <ecree@...arflare.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org
Subject: Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking

On 02/12/2018 02:22 AM, Edward Cree wrote:
> On 10/02/18 03:18, Alexei Starovoitov wrote:
>> On Thu, Feb 08, 2018 at 07:31:55PM +0000, Edward Cree wrote:
>>> By storing subprog boundaries as a subprogno mark on each insn, rather than
>>>  a start (and implicit end) for each subprog, we collect a number of gains:
>>> * More efficient determination of which subprog contains a given insn, and
>>>   thus of find_subprog (which subprog begins at a given insn).
>>> * Number of verifier passes is reduced, since most of the work is done in
>>>   the main insn walk (do_check()).
>> unfortunately this is the opposite of where the verifier should be heading.
>> For bpf libraries and indirect call support the verifier has to be further
>> split into more distinct passes. Not less. Since it needs to determine
>> function boundaries and general validity of the instructions without
>> performing do_check()-style walk.
>> The function in the given prog may be called by other bpf functions that
>> will be loaded later, and just like indirect calls, the final 'check all
>> pointers and load/stores' pass (currently done by do_check) will be done
>> last.
> Strictly speaking, any kind of sanity check done before the program can be
>  do_check() walked can only be a kind of early-fail optimisation, because
>  the final 'check pointers and load/stores' (and register types in general,
>  and any data value tracking we may use for bounded loops) can depend on
>  values returned from callees.

But all the control flow analysis can be done before actual values are known.
Likewise, for bounded loops the required restrictions on values can be known
up front.

> So I am tempted to suggest that any such 'early sanity passes' should be in
>  addition to, rather than instead of, checks at walk time.

Disagree. We can verify control flow early. I view this as similar to gcc and
LLVM, except instead of optimization passes we need to have verification passes.

>> (In case of indirect calls this pass will be done at the time of
>> bpf_map_update() call)
> Btw I hope you're planning to only have these maps writable by the control
>  plane, and not from within BPF programs themselves; allowing BPF execution
>  to trigger a verifier run would be... interesting.
> 
>> while all preparatory passes like function boundaries, basic block
>> detection, control flow check and general instruction sanity will
>> be done before that final run-time linking phase.
>> Hence we cannot merge subprog detection pass with anything else.
>> It has to be done first.
> Why does it *have* to be done first?  Why would a verifier that postponed
>  all checks until the complete prog was available be actually wrong?
> 

I think this is more about keeping the code reasonable and understandable.
Also, delaying basic block detection and loop detection seems neither correct
nor really practical to me. Lots of CFG analysis doesn't care about data
values, and I don't see any way to do it inline with the existing passes.
Adding it after is dangerous because the verifier would have to avoid looping
or otherwise tripping up over invalid CFGs.

>>> * Subprogs no longer have to be contiguous; so long as they don't overlap
>>>   and there are no unreachable insns, verifier is happy.  (This does require
>>>   a small amount of care at jit_subprogs() time to fix up jump offsets, so
>>>   we could instead disallow this if people prefer.)
>> do you have patches to make llvm blend multiple C functions into
>> non-contiguous piece of .text code ?
> I have an assembler that produces eBPF object files.  I can make whatever
>  crazy .text I want ;-)

But that doesn't mean the verifier has to support it. I think we can make the
trade-off in favor of simplifying the verifier rather than trying to handle
problematic code patterns, if they are not common. ;)

>> What would be the purpose of such bizarre code generation?
>> If not, there is no reason to support such code layout in the verifier.
> But the real reason is that disallowing it would have required writing an
>  extra check, and I wanted to get a first prototype out for discussion ASAP.
> 

Will need to look over the code again.

>>> Some other changes were also included to support this:
>>> * Per-subprog info is stored in env->subprog_info, an array of structs,
>>>   rather than several arrays with a common index.
>>> * Call graph is now stored in the new bpf_subprog_info struct; used here for
>>>   check_max_stack_depth() but may have other uses too.
>> The new maximum stack depth algorithm is interesting, but it looks
>> unrelated to the rest and much harder to understand
> It's basically Kahn's algorithm from https://en.wikipedia.org/wiki/Topological_sorting
>  if that helps?  I could perhaps explain that more fully in comments.
>> compared to the existing check_max_stack_depth() algorithm.
>> The patches must explain the benefits clearly. Currently it's not the case.
> The benefit here, which indeed I didn't clearly explain, is that we replace
>  a full recursive-walk pass with an algorithm that's O(n) in the number of
>  subprogs.
> 
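For reference, the O(n) pass being described is basically Kahn's topological
sort over the subprog call graph, propagating frame sizes along call edges.
A rough Python sketch (the structures here are hypothetical, not the actual
bpf_subprog_info layout):

```python
from collections import deque

def max_stack_depth(stack_size, calls):
    """stack_size[i]: frame size of subprog i; calls[i]: subprogs i calls.

    Kahn's algorithm: process subprogs in topological order of the call
    graph, tracking the deepest stack seen at entry to each subprog.
    Runs in O(V + E) over the call graph, with no recursive walk."""
    n = len(stack_size)
    indeg = [0] * n
    for cs in calls:
        for c in cs:
            indeg[c] += 1
    depth = list(stack_size)          # depth[i]: worst stack with i innermost
    q = deque(i for i in range(n) if indeg[i] == 0)
    seen = 0
    while q:
        i = q.popleft()
        seen += 1
        for c in calls[i]:
            depth[c] = max(depth[c], depth[i] + stack_size[c])
            indeg[c] -= 1
            if indeg[c] == 0:
                q.append(c)
    if seen != n:                     # some node never reached indeg 0
        raise ValueError("recursion detected in call graph")
    return max(depth)
```

If main (subprog 0, 64 bytes) calls subprogs 1 (32 bytes) and 2 (16 bytes),
and 1 also calls 2, the deepest chain is 0->1->2 at 112 bytes.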
>>> * LD_ABS and LD_IND were previously disallowed in programs that also contain
>>>   subprog calls.  Now they are only disallowed in callees, i.e. main() can
>>>   always use them even if it also uses subprog calls.  AFAICT this is safe
>>>   (main()'s r1 arg is still known to be ctx, so prologue can do its stuff).
>>>   But again it can be disallowed if necessary.
>> What is the purpose of allowing ld_abs in calls ?
>> The work over the last years has been to avoid ld_abs as much as possible,
>> since these instructions are slow, carry a lot of legacy baggage and corner
>> cases that cause JITs and the verifier to do additional work and make every
>> piece more complex.
>> Allowing ld_abs with calls is going into the opposite direction.
>> Personally I very much dislike patches that "lets do this just because".
>> Nothing in the patch 1 or in this cover letter explain the reason.
> Again, it was because it was more effort to forbid than allow (if we walk a
>  ld_abs before the first call, we don't know yet that calls are coming, so
>  I'd need to do something like recording in check_cfg that a call was seen)
>  and I wanted to get the prototype out there.
> There are more patches coming, btw; I am not just doing a random collection
>  of changes but rather working towards a prototype implementation of bounded
>  loops.  (I also have a 'parent-pointer-per-reg' patch, passing tests, in the
>  series.  That cuts out 100 lines!)  While it's probably different to the way
>  you're planning/expecting them to be handled, I think it's worth having both
>  approaches to compare.  There's a reason this series is marked RFC ;-)
> 
>> Overall I think the set is more harmful than useful.
>>
>> If you like to help developing bpf infrastructure please focus
>> on replacing signed/unsigned min/max tracking logic with simpler and safer logic.
> If you like to keep bringing up [su]minmax tracking, please focus on
>  answering the points I've repeatedly raised about the alternatives being
>  confusing and complicated (due to asymmetry) which leads (and historically
>  led) to them being buggy.
> 
>> 1. [su]minmax burning memory and making verifier process slower
> Memory is cheap.  Do you have numbers that show it's a problem in the real
>  world for someone, or are you committing premature optimisation?  (Even
>  with (say) 4k explored_states * 11 regs, it's only 1.4MB.  All this fuss
>  over just those 32 bytes per reg... it probably takes more memory just to
>  store all our arguments about [su]minmax.)
>> 2. it's internally inconsistent, since it can determine that smin > smax.
>>    We had to remove that WARN_ON, but underlying issue remains.
> IIRC the old tracking could do the same; if you did two JSGTs you could get
>  a branch where min_value > max_value.  This is a natural result of
>  following such 'impossible' branches, in no sense specific to [su]minmax.
> 
>> 3. it had been source of security bugs
> So had the alternative, whose bugs stayed hidden for longer because of the
>  inelegant, asymmetric complexity of the code.
> 
>> 4. The reason it was introduced was to verify validity of variable length
>>    array access, but it can be done much simpler instead.
>> We're dealing with C compiled code here and C arrays have a zero-to-N range.
>> There is no reason to track that a variable can be between -100 to -10.
> But there is reason to track whether a bound came from a signed or an
>  unsigned compare, and I don't see why it should be a point against the
>  _clean_ implementation that it happens to also support unusual things.
>> This tracking is a waste of verifier resources for realistic C programs.
> And what if someone subtracts a variable with a min_value from another, and
>  the min_value is necessary to prove the access safe?  I suppose that's
>  'unrealistic', up until the day that someone wants to do it.
> Also, my design for bounded loops really does need a min_value in order to
>  handle an increasing loop counter.

I have some code that is getting close to working on bounded loops. Here
is how I think it should work (with some annotations on what I have).

1. First task is to ensure all loops are natural loops. A natural loop is
   just a loop with no jumps into its middle, i.e. only a single entry
   point. This is a necessary condition for finding induction variables
   later, at least without making things overly complex.

   More precisely: the natural loop of a back edge a->b is {b} plus the
   nodes {n_0, ..., n_i} that can reach a without going through b.

   To find these we do the following

   1.a: Build the CFG of all the instructions so we can walk around the
        program easily.
   1.b: Find the basic blocks and build the basic block CFG. After this
        with a few helper calls we can move around the CFG both at block
        level and insn level.
   1.c: Build the dom tree and post dom tree. These are the data structures
        that allow us to actually find the natural loops. I can write more
        about this later.
   1.d: Using the post dom tree verify all the CFG loops are in fact
        natural loops. I made some simplifications here and threw
        out loops that have multiple jumps to the header node and nested
        loops for now. Both can be handled with some additional complexity.

  Versions of the above can be found in gcc and llvm, so it appears to be
  fairly regular stuff from the compiler side. For LLVM the passes are
  --loops and --postdomtree. I dumped the various CFGs into dot format and
  graphed them for fun. The above seemed to work fine and rejected the
  small set of "bad" programs I threw at it.
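To illustrate, steps 1.c/1.d boil down to something like the Python sketch
below (toy structures, not the verifier's actual ones): compute dominators
with the classic iterative dataflow, spot back edges a->b where b dominates
a, then collect each natural loop body by walking predecessors back from a
without passing through the header b:

```python
def dominators(succ, entry):
    """Iterative dataflow: dom[n] = {n} | intersection of dom over preds.
    succ maps each basic block to its successor list."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, ss in succ.items():
        for s in ss:
            pred[s].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | (set.intersection(*(dom[p] for p in pred[n]))
                         if pred[n] else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def natural_loops(succ, entry):
    """A back edge a->b (b dominates a) defines the natural loop: {b} plus
    every node that can reach a without going through b."""
    dom = dominators(succ, entry)
    loops = []
    for a, ss in succ.items():
        for b in ss:
            if b in dom[a]:                      # back edge a -> b
                body, stack = {b, a}, [a]
                while stack:                     # walk preds, stop at header
                    n = stack.pop()
                    for p in (m for m, s2 in succ.items() if n in s2):
                        if p not in body:
                            body.add(p)
                            stack.append(p)
                loops.append((b, frozenset(body)))
    return loops
```

For a CFG 0->1, 1->{2,3}, 2->1, the only back edge is 2->1 and the natural
loop is header 1 with body {1, 2}. A jump from outside into block 2 would
create a second entry and the loop would no longer be natural.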

2. Next on the list is to find all the induction variables. These are
   variables of the form `i = i + c` where `c` is a constant. It's easy to
   get carried away here and let `c` be all sorts of things: linear
   expressions, or other induction variables. But requiring it to be a
   constant simplifies the problem a lot and is still very useful.

   Then ensure that the loop variable is a loop induction variable, i.e.
   it is just one of the variables found above.

   A couple of comments: the decreasing case, `i = i - c`, can also be
   handled without much trouble; I just didn't bother yet. Also, the
   `i = i + c` cannot be inside a part of the CFG that can be skipped,
   e.g. if/else logic.
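A toy sketch of that detection, using a made-up tuple encoding for
instructions (not real BPF insns): a register qualifies only if its sole
write in the loop body is a single `reg += const` with positive constant.

```python
def induction_vars(loop_body):
    """Collect simple induction variables of the form i = i + c.
    Instructions are ('add', dst, const) or ('mov', dst, src) tuples."""
    adds, killed = {}, set()
    for op, dst, src in loop_body:
        if op == 'add' and isinstance(src, int) and src > 0:
            if dst in adds:          # a second update disqualifies it
                killed.add(dst)
            adds[dst] = src
        else:
            killed.add(dst)          # any other write to dst disqualifies it
    return {r: c for r, c in adds.items() if r not in killed}
```

So in a body that does `r1 += 1` and `r3 += 4` but also overwrites `r2`,
only r1 and r3 survive as induction variables.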

3. Finally, ensure the comparison is JGT with a constant. Of course that is
   just one case; the other comparisons could probably be added as well with
   some case analysis. Also, the bound doesn't need to be a constant as long
   as it is itself bounded, but starting with a constant and relaxing the
   constraints later seems like a good start to me.
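The step 3 check is then almost trivial once step 2 has run; a sketch in the
same toy encoding (again hypothetical, not real verifier structures):

```python
def exit_test_ok(cond, ind_vars):
    """Accept only an exit test `if reg > imm` (a JGT against a constant)
    whose register is a known induction variable from step 2."""
    op, reg, imm = cond
    return op == 'jgt' and reg in ind_vars and isinstance(imm, int)
```

Anything else, a different comparison opcode or a non-induction register,
gets rejected until the case analysis is extended.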

OK, 2 and 3 are more of a rough sketch, but I can probably get a prototype
out in a few weeks. I need to get some sockmap code out in the next few days
so I'm busy this week. Perhaps this weekend. The code currently does 1 and 2
above, where 1 is tested reasonably well, 2 less so, and 3 is just a sketch
and untested.

4. The last step is some worst-case analysis to ensure the loop, although
   eventually terminating, is not too long. I've not done much with this
   yet. Here is where I guess min values will be useful. The worst-case
   trip count would be `c - min_value`, and then
   `(c - min_value) * max(loop_cfg(i))` would be the worst-case number of
   instructions that might be executed, where loop_cfg(i) is the max insns
   that can be run through the loop along the worst-case CFG path.
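That bound can be written down directly; a sketch (generalized to a step of
`step_c` per iteration, with all names illustrative):

```python
def worst_case_insns(c, min_i, step_c, max_body_insns):
    """Upper-bound the instructions a bounded loop can execute.

    With exit test i > c, update i += step_c, and a proven lower bound
    min_i on i at loop entry, the trip count is at most
    ceil((c - min_i) / step_c); multiply by the longest path through
    the loop body."""
    trips = max(0, -(-(c - min_i) // step_c))   # ceiling division
    return trips * max_body_insns
```

E.g. a counter starting at min_value 0, stepping by 1 up to 10, over a body
whose longest path is 5 insns, bounds the loop at 50 instructions.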

I think logically the above is about what we should aim for.

> 
>> All of [su]minmax can be replaced with {u32_max_value, can_be_zero, can_be_negative}
>> tracking which is 4 bytes and 2 bits of info per bpf_reg_state instead of 32 bytes.
>> Such tracking will be more accurate as well.
>> llvm can optimize 'var >= 1' compare into 'var != 0' and we've seen
>> cases that this optimization breaks minmax tracking, since !=0 simply
>> doesn't fit into minmax logic. One can argue reg_set_min_max() needs to track
>> this extra boolean state as well,
> Well, if real-world programs are hitting this and need it, then yes,
>  handling for jeq/jne 0 should be changed to give umin >= 1.
> 
>> but then it would mean more inconsistency
>> in the bpf_reg_state and information contradicting each other.
>> Any verifier pass looking at bpf_reg_state at any given time should
>> see bpf_reg_state being valid and not guessing why smin is suddenly more
>> than smax.
> Again, nothing to do with [su]minmax, your approach could have a state in
>  which max_value=0 and can_be_zero is false, for the exact same reason.
> 
> -Ed
> 
