[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f7c75d75-71d5-096a-ffe4-66eeb11ad594@solarflare.com>
Date: Mon, 26 Feb 2018 12:53:32 +0000
From: Edward Cree <ecree@...arflare.com>
To: John Fastabend <john.fastabend@...il.com>,
netdev <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [RFC PATCH bpf-next 05/12] bpf/verifier: detect loops dynamically
rather than statically
On 23/02/18 22:27, John Fastabend wrote:
> On 02/23/2018 09:40 AM, Edward Cree wrote:
>> Add in a new chain of parent states, which does not cross function-call
>> boundaries, and check whether our current insn_idx appears anywhere in
>> the chain. Since all jump targets have state-list marks (now placed
>> by prepare_cfg_marks(), which replaces check_cfg()), it suffices to
>> thread this chain through the existing explored_states and check it
>> only in is_state_visited().
>> By using this parent-state chain to detect loops, we open up the
>> possibility that in future we could determine a loop to be bounded and
>> thus accept it.
>> A few selftests had to be changed to ensure that all the insns walked
>> before reaching the back-edge are valid.
>>
> I still believe this needs to be done in an initial pass where we can
> build a proper CFG and analyze the loops to ensure we have loops that
> can be properly verified. Otherwise we end up trying to handle all the
> special cases later like the comment in patch 7/12 'three-jump trick'.
It's not so much that the 'three-jump trick' is a special case with special
code to handle it; more that it's the simplest construction I found for
defeating the case where you make a purely local decision and just look at
whether there's a progressing test in the current loop body. That & the
entire class of trickery it represents is dealt with by saying that once a
test has been determined to be bounded, it has to stay bounded on all
subsequent loop iterations. (I considered various relaxed versions of
this constraint but they're harder to reason about.)
I would be happier if I could write out a proof that the verifier does what
I think it does; but I'm not really sure how to formally define "what I
think it does"...
> So rather than try to handle all loops only handle "normal" loops. In
> an early CFG stage with DOM tree the algorithm for this is already
> used by gcc/clang so there is good precedent for this type of work.
>
> Without this we are open to other "clever" goto programs that create
> loops that we have to special case. Better to just reject all these
> classes of loops because most users will never need them.
If we want to be narrow (perhaps just initially), we could also just say
that the loop body can't contain any other conditional jumps. But that
would be awkward since, while users may not need loop-in-loop, they are
quite likely to need if-in-loop, and I don't yet quite have a solid
algorithm for disallowing only the former.
> See more notes in 7/12 patch, sorry I've been busy and not had a chance
> to push code to build cfg and dom tree yet.
That's fine, gives me more time to polish mine so that it looks like the
better option when your version does come out ;-)
> Also case analysis in patch description describing types of loops
> this handles (either here or in another patch) would help me understand
> at least.
Yes, I didn't really describe that very much, did I? Here's a précis:
>From a 'loop structure' point of view, this basically handles anything;
if the verifier walks it and arrives at an insn it's already visited,
then the loop handling fires. The current implementation does have
the restriction that the conditional branch that's responsible for loop
termination has to loop in the 'true' branch, i.e.
loop: /* do things */
jr cc, loop
rather than
loop: /* do things */
jr cc, break
ja loop
break: /* etc. */
but that's really just because only one case is handled so far in
is_loop_bounded(), and there's no reason the latter kind can't be
handled too.
Interestingly, if it didn't have a separate check for recursion, it
would also handle boundedly recursive calls, like in the test_verifier
test case "bounded recursion" added in patch #8. It _really_ doesn't
care about the 'structure' of the loop...
The other constraint at the moment is that only JLT is handled as the
back-edge jump, and the 'src' has to be a constant. Extending it to
cover other insns than JLT should be pretty trivial (just need an
is_reg_decreasing() function and signed equivalents, mainly); adding
support for variable src would be harder & probably unnecessary.
So currently it will handle loops like
loop: /* do things, including arithmetic on rX */
jr <, rX, 16, loop
or even like
loop: /* do things */
jr <, rX, 16, more
ja break
more: /* do more things */
ja loop /* or jr cc, loop; so long as this isn't sometimes-bounded */
break: /* etc. */
>From a C perspective, this should mean things like
int i;
for (i = 0; i < 10; i++)
/* do things */;
where that 'do things' can include all kinds of jumps, including
conditional ones (mostly), and can even have extra 'i++' statements.
This last is necessary for the (potential) use case of IPv6 options
parsing, which would look something like:
int nh_off;
for (nh_off = iph_off + 40;
nh_off < ARBITRARY_MAXIMUM && data + nh_off + 2 < data_end;
) {
u8 nexthdr = data[nh_off];
u16 exthdrlen = (data[nh_off + 1] + 1) * 8;
if (!is_option(nexthdr)) break;
nh_off += exthdrlen;
}
Note that it takes some value analysis (noting that exthdrlen >= 8)
to determine that nh_off is increasing; I think there is at least
*some* advantage in keeping all of that value analysis in one place
(the existing walker) rather than having some potentially duplicate
code trying to prove similar things on the dominance tree.
But if you (/maintainers) still think the static-analysis cfg/dom way
is more appropriate after having seen mine, then go ahead and do it
your way, I won't try to fight to have it done this way :-)
-Ed
Powered by blists - more mailing lists