Date:   Tue, 13 Feb 2018 18:14:53 +0000
From:   Edward Cree <ecree@...arflare.com>
To:     John Fastabend <john.fastabend@...il.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        <netdev@...r.kernel.org>
Subject: Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking

On 13/02/18 05:09, John Fastabend wrote:
> I have some code that is getting close to working on bounded loops. Here
> is how I think it should work, (with some annotations on what I have).
Thanks for this!  For comparison, since my code is also 'getting close to
 working', here's how my approach works.

First, we get rid of (most of) check_cfg().  Replace it with a function
 that just puts the state-list marks on jump targets (including following-
 insns of conditional jumps and calls), then does a quick check that every
 insn following any JMP-class insn has a state-list mark (this suffices to
 detect statically-unreachable code).  That's two O(n) passes over insn_cnt,
 and much easier to understand than check_cfg (I only recently figured out
 that check_cfg is also a topological sort algorithm).
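
In rough (and entirely hypothetical) C the two passes would look something
 like this - the insn layout and helper names are invented for illustration,
 they're not the real verifier structures:

#include <stdbool.h>

struct insn {
	bool is_jmp;	/* any JMP-class insn: cond/uncond jumps, calls, exit */
	bool is_uncond;	/* unconditional JA or EXIT: no fall-through */
	int  target;	/* jump/call target insn_idx, or -1 if none */
};

/* Pass 1: put state-list marks on all jump targets, plus the following-
 * insn of conditional jumps and calls (which can fall through). */
static void mark_jump_targets(const struct insn *insns, bool *marked, int cnt)
{
	for (int i = 0; i < cnt; i++) {
		if (!insns[i].is_jmp)
			continue;
		if (insns[i].target >= 0)
			marked[insns[i].target] = true;
		if (!insns[i].is_uncond && i + 1 < cnt)
			marked[i + 1] = true;
	}
}

/* Pass 2: every insn following a JMP-class insn must carry a mark; if it
 * doesn't, nothing jumps to it and nothing falls through to it, so it is
 * statically unreachable. */
static bool check_reachable(const struct insn *insns, const bool *marked,
			    int cnt)
{
	for (int i = 1; i < cnt; i++)
		if (insns[i - 1].is_jmp && !marked[i])
			return false;	/* unreachable code: reject prog */
	return true;
}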

Now of course at this point there is nothing to prevent loops at all, so in
 is_state_visited we add a check that searches backwards through the
 parentage chain for a state with the same insn_idx as us.  (We have to
 change a couple of things for this to work: store the insn_idx of the
 current frame instead of the callsite (insn_idx of the calling frame), and
 thanks to function calls it turns out that the chain we want doesn't match
 that for any of the registers.  So I gave registers their own parentage
 chains (that's a nice simplification and worth doing in itself) and added
 a new chain to func_states that does what I want.)  If we find such a state,
 we have looped, so we reject the prog.
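
The chain walk itself is simple; here's a hypothetical standalone sketch
 (the struct layout is invented for illustration, it's not the real
 bpf_verifier_state):

struct func_state {
	int insn_idx;			/* insn_idx of this frame, not the callsite */
	struct func_state *parent;	/* the new chain added to func_states */
};

/* Walk backwards up the parentage chain; if the current insn_idx already
 * appears there, we have walked a cycle. */
static struct func_state *find_loop(struct func_state *cur, int insn_idx)
{
	struct func_state *st;

	for (st = cur->parent; st; st = st->parent)
		if (st->insn_idx == insn_idx)
			return st;	/* found the loop head */
	return NULL;			/* no cycle on this path */
}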

So far the diffstat has 159 insertions, 353 deletions.  The downside of
 course is that walking the parentage chain all the time to look for loops
 is going to hurt performance.  I haven't yet come up with a way around that
 but some time spent meditating on data structures might turn something up ;)

Next we want to start allowing the loop as long as it's bounded.  So when we
 detect the loop in is_state_visited, we look at the last jump we did.  (With
 a slight tweak we can make this always available in prev_insn_idx, even if
 we elided the push/pop_stack in check_cond_jmp_op.)  We then look at the
 jump insn, the old state (conveniently returned from our loop detection
 routine!) and the new state, to decide whether the loop is moving towards
 termination.  The test I'm using for now shares some "it's just one case"
 limits with your point 3 :-)  For now it's (a rough C sketch follows the
 list):
  a) Is the jump a JLT with a constant?  (Same reason as yours.)
  b) Are we in the 'jump true' branch?  (I haven't quite figured out how to
     make sure of this in general, checking insn_idx < prev_insn_idx works
     but rules out back-edges with forward jumps.  Fortunately those only
     occur in unnatural loops so we don't have to support them all.)
  c) Does the jump reg hold a SCALAR_VALUE in both old and new states?
  d) Did the jump reg->umin_value increase from old to new states?
  e) How _fast_ did it increase?  The rule for now is to take the delta,
     multiply it by 16, and if that exceeds the jump compare constant, we're
     happy.  This doesn't actually limit loops to 16 iterations, because the
     delta could decay exponentially - the worst case comes out as 687
     iterations.  But we could do something cleverer: record the delta so we
     can insist on linear induction variables (i = i + c), giving those more
     iterations without giving too many to the worst-case exponential.
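
Put together, checks (c)-(e) would look roughly like the following; (a) and
 (b) are decided by inspecting the jump insn itself before this gets called.
 The reg layout is a stand-in for the real bpf_reg_state, and I'm ignoring
 possible overflow in the multiply for the sake of the sketch:

#include <stdbool.h>
#include <stdint.h>

struct reg {
	bool	 is_scalar;	/* holds a SCALAR_VALUE?  (check c) */
	uint64_t umin_value;
};

/* Decide whether a detected loop is moving towards termination. */
static bool loop_makes_progress(const struct reg *old_reg,
				const struct reg *new_reg,
				uint64_t jlt_imm)	/* JLT compare constant */
{
	uint64_t delta;

	/* (c) both old and new states must hold scalars */
	if (!old_reg->is_scalar || !new_reg->is_scalar)
		return false;
	/* (d) umin_value must have increased from old to new */
	if (new_reg->umin_value <= old_reg->umin_value)
		return false;
	/* (e) and fast enough: 16 * delta must exceed the compare constant */
	delta = new_reg->umin_value - old_reg->umin_value;
	return delta * 16 > jlt_imm;
}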

If all of these checks pass, then the loop is (so far) behaving itself.  So
 we mark the old state with a "this is being looped on" flag, which prevents
 it from being used for pruning (I haven't yet come up with a way to clear
 the flag once all the state's descendants have reached an exit, so this
 will unfortunately reduce pruning effectiveness), and then we carry on the
 walk.  do_check() will ultimately walk around the loop as many times as the
 program will.
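
So the pruning side would gain a guard along these lines (names invented):

#include <stdbool.h>

struct explored_state {
	bool looped_on;	/* set when a bounded loop is detected on this state */
	/* ... the rest of the verifier state ... */
};

/* A state currently being looped on can't serve as a pruning point: the
 * walk may still be part-way around the loop, so "we reached this state
 * before and it was safe" doesn't yet hold. */
static bool can_prune_against(const struct explored_state *sl)
{
	return !sl->looped_on;
}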

Of course, there's one problem here: when do_check walks the loop for the
 last time, and knows that the loop test has to be false... it will still
 follow the branch anyway and keep walking forever.  So I also had to add a
 patch to not follow 'impossible' branches, determined by the jump register
 getting inconsistent bounds.  This is similar to what check_cond_jmp_op()
 already does with JEQ/JNE of const reg against const imm.
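
The 'impossible' test is just a consistency check on the jump register
 after the branch's implied constraint has been applied.  The
 smin/smax/umin/umax names below are the verifier's usual scalar bounds,
 though the surrounding struct is simplified for the sketch:

#include <stdbool.h>
#include <stdint.h>

struct scalar_bounds {
	int64_t	 smin_value, smax_value;	/* signed bounds */
	uint64_t umin_value, umax_value;	/* unsigned bounds */
};

/* If applying the branch's constraint left min > max, no concrete value
 * satisfies the register state: the branch cannot be taken, so do_check()
 * shouldn't follow it. */
static bool branch_is_impossible(const struct scalar_bounds *r)
{
	return r->smin_value > r->smax_value ||
	       r->umin_value > r->umax_value;
}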

>     The worst case loop count would
>     be `c - min_value`, so `(c - min_value) * max(loop_cfg(i))` would be
>     the worst case instruction count that might be executed, where
>     loop_cfg(i) is the max insns that can be run through the loop with the
>     worst case CFG path.
Currently I don't have any equivalent of that loop_cfg(i) factor, so I'm
 applying the same max iteration count to all loops regardless of the size of
 the loop body.  But I guess I could do something like counting insns since
 parent state and recording that in the explored_states, so that when I have
 my detected loop I can count how long it is.
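
Sketching that idea (again with invented names, extending the hypothetical
 explored_state from earlier):

struct explored_state {
	int insns_since_parent;		/* insns walked since parent state */
	struct explored_state *parent;
};

/* Sum the per-state insn counts from the current state back up to the
 * looped-on ancestor; assuming loop_head really is an ancestor of cur,
 * this gives the length of the loop body just walked. */
static int loop_body_len(const struct explored_state *loop_head,
			 const struct explored_state *cur)
{
	int len = 0;

	for (; cur && cur != loop_head; cur = cur->parent)
		len += cur->insns_since_parent;
	return len;
}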

Anyway, that's my design, I hope it wasn't _too_ horrifying, and I hope to
 have some RFC patches in the next week or so.

-Ed
