netdev - Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 12 Feb 2018 10:22:49 +0000
From:   Edward Cree <ecree@...arflare.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        <netdev@...r.kernel.org>
Subject: Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking

On 10/02/18 03:18, Alexei Starovoitov wrote:
> On Thu, Feb 08, 2018 at 07:31:55PM +0000, Edward Cree wrote:
>> By storing subprog boundaries as a subprogno mark on each insn, rather than
>>  a start (and implicit end) for each subprog, we collect a number of gains:
>> * More efficient determination of which subprog contains a given insn, and
>>   thus of find_subprog (which subprog begins at a given insn).
>> * Number of verifier passes is reduced, since most of the work is done in
>>   the main insn walk (do_check()).
> unfortunately this is the opposite of where the verifier should be heading.
> For bpf libraries and indirect call support the verifier has to be further
> split into more distinct passes. Not less. Since it needs to determine
> function boundaries and general validity of the instructions without
> performing do_check()-style walk.
> The function in the given prog may be called by other bpf functions that will be
> loaded later and just like indirect calls the final 'check all pointers
> and load/stores' pass (currently done by do_check) will done last.
Strictly speaking, any kind of sanity check done before the program can be
 do_check() walked can only be a kind of early-fail optimisation, because
 the final 'check pointers and load/stores' (and register types in general,
 and any data value tracking we may use for bounded loops) can depend on
 values returned from callees.
So I am tempted to suggest that any such 'early sanity passes' should be in
 addition to, rather than instead of, checks at walk time.
> (In case of indirect calls this pass will done at the time of
> bpf_map_update() call)
Btw I hope you're planning to only have these maps writable by the control
 plane, and not from within BPF programs themselves; allowing BPF execution
 to trigger a verifier run would be... interesting.

> while all preparatory passes like function boundaries, basic block
> detection, control flow check and general instruction sanity will
> be done before that final run-time linking phase.
> Hence we cannot merge subprog detection pass with anything else.
> It has to be done first.
Why does it *have* to be done first?  Why would a verifier that postponed
 all checks until the complete prog was available be actually wrong?

>> * Subprogs no longer have to be contiguous; so long as they don't overlap
>>   and there are no unreachable insns, verifier is happy.  (This does require
>>   a small amount of care at jit_subprogs() time to fix up jump offsets, so
>>   we could instead disallow this if people prefer.)
> do you have patches to make llvm blend multiple C functions into
> non-contiguous piece of .text code ?
I have an assembler that produces eBPF object files.  I can make whatever
 crazy .text I want ;-)
> What would be the purpose of such bizarre code generation?
> If not, there is no reason to support such code layout in the verifier.
But the real reason is that disallowing it would have required writing an
 extra check, and I wanted to get a first prototype out for discussion ASAP.

>> Some other changes were also included to support this:
>> * Per-subprog info is stored in env->subprog_info, an array of structs,
>>   rather than several arrays with a common index.
>> * Call graph is now stored in the new bpf_subprog_info struct; used here for
>>   check_max_stack_depth() but may have other uses too.
> The new maximum stack depth algorithm is interesting, but it looks
> unrelated to the rest and much harder to understand
It's basically Kahn's algorithm from https://en.wikipedia.org/wiki/Topological_sorting
 if that helps?  I could perhaps explain that more fully in comments.
> comparing to existing check_max_stack_depth() algorithm.
> The patches must explain the benefits clearly. Currently it's not the case.
The benefit here, which indeed I didn't clearly explain, is that we replace
 a full recursive-walk pass with an algorithm that's O(n) in the number of
 subprogs.

>> * LD_ABS and LD_IND were previously disallowed in programs that also contain
>>   subprog calls.  Now they are only disallowed in callees, i.e. main() can
>>   always use them even if it also uses subprog calls.  AFAICT this is safe
>>   (main()'s r1 arg is still known to be ctx, so prologue can do its stuff).
>>   But again it can be disallowed if necessary.
> What is the purpose of allowing ld_abs in calls ?
> The work over the last years have been to avoid ld_abs as much as possible,
> since these instructions are slow, carry a lot of legacy baggage and corner cases
> that cause JITs and verifier to do additional work and make every piece more complex.
> Allowing ld_abs with calls is going into the opposite direction.
> Personally I very much dislike patches that "lets do this just because".
> Nothing in the patch 1 or in this cover letter explain the reason.
Again, it was because it was more effort to forbid than allow (if we walk a
 ld_abs before the first call, we don't know yet that calls are coming, so
 I'd need to do something like recording in check_cfg that a call was seen)
 and I wanted to get the prototype out there.
There are more patches coming, btw; I am not just doing a random collection
 of changes but rather working towards a prototype implementation of bounded
 loops.  (I also have a 'parent-pointer-per-reg' patch, passing tests, in the
 series.  That cuts out 100 lines!)  While it's probably different to the way
 you're planning/expecting them to be handled, I think it's worth having both
 approaches to compare.  There's a reason this series is marked RFC ;-)

> Overall I think the set is more harmful than useful.
>
> If you like to help developing bpf infrastructure please focus
> on replacing signed/unsigned min/max tracking logic with simpler and safer logic.
If you like to keep bringing up [su]minmax tracking, please focus on
 answering the points I've repeatedly raised about the alternatives being
 confusing and complicated (due to asymmetry) which leads (and historically
 led) to them being buggy.

> 1. [su]minmax burning memory and making verifier process slower
Memory is cheap.  Do you have numbers that show it's a problem in the real
 world for someone, or are you committing premature optimisation?  (Even
 with (say) 4k explored_states * 11 regs, it's only 1.4MB.  All this fuss
 over just those 32 bytes per reg... it probably takes more memory just to
 store all our arguments about [su]minmax.)
> 2. it's internally inconsistent, since it can determine that smin > smax.
>    We had to remove that WARN_ON, but underlying issue remains.
IIRC the old tracking could do the same; if you did two JSGTs you could get
 a branch where min_value > max_value.  This is a natural result of
 following such 'impossible' branches, in no sense specific to [su]minmax.

> 3. it had been source of security bugs
So had the alternative, whose bugs stayed hidden for longer because of the
 inelegant, asymmetric complexity of the code.

> 4. The reason it was introduced was to verify validity of variable length
>    array access, but it can be done much simpler instead.
> We're dealing with C compiled code here and C arrays have zero till N range.
> There is no reason to track that a variable can be between -100 to -10.
But there is reason to track whether a bound came from a signed or an
 unsigned compare, and I don't see why it should be a point against the
 _clean_ implementation that it happens to also support unusual things.
> This tracking is a waste of verifier resources for realistic C programs.
And what if someone subtracts a variable with a min_value from another, and
 the min_value is necessary to prove the access safe?  I suppose that's
 'unrealistic', up until the day that someone wants to do it.
Also, my design for bounded loops really does need a min_value in order to
 handle an increasing loop counter.

> All of [su]minmax can be replaced with {u32_max_value, can_be_zero, can_be_negative}
> tracking which is 4 bytes and 2 bits of info per bpf_reg_state instead of 32 bytes.
> Such tracking will be more accurate as well.
> llvm can optimize 'var >= 1' compare into 'var != 0' and we've seen
> cases that this optimization breaks minmax tracking, since !=0 simply
> doesn't fit into minmax logic. One can argue reg_set_min_max() needs to track
> this extra boolean state as well,
Well, if real-world programs are hitting this and need it, then yes,
 handling for jeq/jne 0 should be changed to give umin >= 1.

> but then it would mean more inconsistency
> in the bpf_reg_state and information contradicting each other.
> Any verifier pass looking at bpf_reg_state at any given time should
> see bpf_reg_state being valid and not guessing why smin is suddenly more
> than smax.
Again, nothing to do with [su]minmax, your approach could have a state in
 which max_value=0 and can_be_zero is false, for the exact same reason.

-Ed