[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <76a7c292-7146-d73c-f5b7-01340a13916e@solarflare.com>
Date: Mon, 12 Feb 2018 10:22:49 +0000
From: Edward Cree <ecree@...arflare.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
CC: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
<netdev@...r.kernel.org>
Subject: Re: [RFC PATCH bpf-next 0/2] bpf/verifier: simplify subprog tracking
On 10/02/18 03:18, Alexei Starovoitov wrote:
> On Thu, Feb 08, 2018 at 07:31:55PM +0000, Edward Cree wrote:
>> By storing subprog boundaries as a subprogno mark on each insn, rather than
>> a start (and implicit end) for each subprog, we collect a number of gains:
>> * More efficient determination of which subprog contains a given insn, and
>> thus of find_subprog (which subprog begins at a given insn).
>> * Number of verifier passes is reduced, since most of the work is done in
>> the main insn walk (do_check()).
> unfortunately this is the opposite of where the verifier should be heading.
> For bpf libraries and indirect call support the verifier has to be further
> split into more distinct passes. Not less. Since it needs to determine
> function boundaries and general validity of the instructions without
> performing do_check()-style walk.
> The function in the given prog may be called by other bpf functions that will be
> loaded later and just like indirect calls the final 'check all pointers
> and load/stores' pass (currently done by do_check) will done last.
Strictly speaking, any kind of sanity check done before the program can be
do_check() walked can only be a kind of early-fail optimisation, because
the final 'check pointers and load/stores' (and register types in general,
and any data value tracking we may use for bounded loops) can depend on
values returned from callees.
So I am tempted to suggest that any such 'early sanity passes' should be in
addition to, rather than instead of, checks at walk time.
> (In case of indirect calls this pass will done at the time of
> bpf_map_update() call)
Btw I hope you're planning to only have these maps writable by the control
plane, and not from within BPF programs themselves; allowing BPF execution
to trigger a verifier run would be... interesting.
> while all preparatory passes like function boundaries, basic block
> detection, control flow check and general instruction sanity will
> be done before that final run-time linking phase.
> Hence we cannot merge subprog detection pass with anything else.
> It has to be done first.
Why does it *have* to be done first? Why would a verifier that postponed
all checks until the complete prog was available be actually wrong?
>> * Subprogs no longer have to be contiguous; so long as they don't overlap
>> and there are no unreachable insns, verifier is happy. (This does require
>> a small amount of care at jit_subprogs() time to fix up jump offsets, so
>> we could instead disallow this if people prefer.)
> do you have patches to make llvm blend multiple C functions into
> non-contiguous piece of .text code ?
I have an assembler that produces eBPF object files. I can make whatever
crazy .text I want ;-)
> What would be the purpose of such bizarre code generation?
> If not, there is no reason to support such code layout in the verifier.
But the real reason is that disallowing it would have required writing an
extra check, and I wanted to get a first prototype out for discussion ASAP.
>> Some other changes were also included to support this:
>> * Per-subprog info is stored in env->subprog_info, an array of structs,
>> rather than several arrays with a common index.
>> * Call graph is now stored in the new bpf_subprog_info struct; used here for
>> check_max_stack_depth() but may have other uses too.
> The new maximum stack depth algorithm is interesting, but it looks
> unrelated to the rest and much harder to understand
It's basically Kahn's algorithm from https://en.wikipedia.org/wiki/Topological_sorting
if that helps? I could perhaps explain that more fully in comments.
> comparing to existing check_max_stack_depth() algorithm.
> The patches must explain the benefits clearly. Currently it's not the case.
The benefit here, which indeed I didn't clearly explain, is that we replace
a full recursive-walk pass with an algorithm that's O(n) in the number of
subprogs.
>> * LD_ABS and LD_IND were previously disallowed in programs that also contain
>> subprog calls. Now they are only disallowed in callees, i.e. main() can
>> always use them even if it also uses subprog calls. AFAICT this is safe
>> (main()'s r1 arg is still known to be ctx, so prologue can do its stuff).
>> But again it can be disallowed if necessary.
> What is the purpose of allowing ld_abs in calls ?
> The work over the last years have been to avoid ld_abs as much as possible,
> since these instructions are slow, carry a lot of legacy baggage and corner cases
> that cause JITs and verifier to do additional work and make every piece more complex.
> Allowing ld_abs with calls is going into the opposite direction.
> Personally I very much dislike patches that "lets do this just because".
> Nothing in the patch 1 or in this cover letter explain the reason.
Again, it was because it was more effort to forbid than allow (if we walk a
ld_abs before the first call, we don't know yet that calls are coming, so
I'd need to do something like recording in check_cfg that a call was seen)
and I wanted to get the prototype out there.
There are more patches coming, btw; I am not just doing a random collection
of changes but rather working towards a prototype implementation of bounded
loops. (I also have a 'parent-pointer-per-reg' patch, passing tests, in the
series. That cuts out 100 lines!) While it's probably different to the way
you're planning/expecting them to be handled, I think it's worth having both
approaches to compare. There's a reason this series is marked RFC ;-)
> Overall I think the set is more harmful than useful.
>
> If you like to help developing bpf infrastructure please focus
> on replacing signed/unsigned min/max tracking logic with simpler and safer logic.
If you like to keep bringing up [su]minmax tracking, please focus on
answering the points I've repeatedly raised about the alternatives being
confusing and complicated (due to asymmetry) which leads (and historically
led) to them being buggy.
> 1. [su]minmax burning memory and making verifier process slower
Memory is cheap. Do you have numbers that show it's a problem in the real
world for someone, or are you committing premature optimisation? (Even
with (say) 4k explored_states * 11 regs, it's only 1.4MB. All this fuss
over just those 32 bytes per reg... it probably takes more memory just to
store all our arguments about [su]minmax.)
> 2. it's internally inconsistent, since it can determine that smin > smax.
> We had to remove that WARN_ON, but underlying issue remains.
IIRC the old tracking could do the same; if you did two JSGTs you could get
a branch where min_value > max_value. This is a natural result of
following such 'impossible' branches, in no sense specific to [su]minmax.
> 3. it had been source of security bugs
So had the alternative, whose bugs stayed hidden for longer because of the
inelegant, asymmetric complexity of the code.
> 4. The reason it was introduced was to verify validity of variable length
> array access, but it can be done much simpler instead.
> We're dealing with C compiled code here and C arrays have zero till N range.
> There is no reason to track that a variable can be between -100 to -10.
But there is reason to track whether a bound came from a signed or an
unsigned compare, and I don't see why it should be a point against the
_clean_ implementation that it happens to also support unusual things.
> This tracking is a waste of verifier resources for realistic C programs.
And what if someone subtracts a variable with a min_value from another, and
the min_value is necessary to prove the access safe? I suppose that's
'unrealistic', up until the day that someone wants to do it.
Also, my design for bounded loops really does need a min_value in order to
handle an increasing loop counter.
> All of [su]minmax can be replaced with {u32_max_value, can_be_zero, can_be_negative}
> tracking which is 4 bytes and 2 bits of info per bpf_reg_state instead of 32 bytes.
> Such tracking will be more accurate as well.
> llvm can optimize 'var >= 1' compare into 'var != 0' and we've seen
> cases that this optimization breaks minmax tracking, since !=0 simply
> doesn't fit into minmax logic. One can argue reg_set_min_max() needs to track
> this extra boolean state as well,
Well, if real-world programs are hitting this and need it, then yes,
handling for jeq/jne 0 should be changed to give umin >= 1.
> but then it would mean more inconsistency
> in the bpf_reg_state and information contradicting each other.
> Any verifier pass looking at bpf_reg_state at any given time should
> see bpf_reg_state being valid and not guessing why smin is suddenly more
> than smax.
Again, nothing to do with [su]minmax, your approach could have a state in
which max_value=0 and can_be_zero is false, for the exact same reason.
-Ed
Powered by blists - more mailing lists