netdev - Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181003155952.rsr4v5yvi44v62ld@ast-mbp.dhcp.thefacebook.com>
Date:   Wed, 3 Oct 2018 08:59:53 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Jiong Wang <jiong.wang@...ronome.com>
Cc:     Edward Cree <ecree@...arflare.com>, ast@...nel.org,
        daniel@...earbox.net, netdev@...r.kernel.org
Subject: Re: [RFC PATCH v2 bpf-next 0/2] verifier liveness simplification

On Wed, Oct 03, 2018 at 04:36:31PM +0100, Jiong Wang wrote:
> On 28/09/2018 14:36, Edward Cree wrote:
> > On 26/09/18 23:16, Jiong Wang wrote:
> >> On 22/08/2018 20:00, Edward Cree wrote:
> >>> In the future this idea may be extended to form use-def chains.
> >>
> >>   1. instruction level use->def chain
> >>
> >>      - new use->def chains for each instruction. one eBPF insn could have
> two
> >>        uses at maximum.
> > I was thinking of something a lot weaker/simpler, just making
> >     ld rX, rY
> >  copy rY.parent into rX.parent and not read-mark rY (whereas actual
> >  arithmetic, pointer deref etc. would still create read marks).
> 
> Thanks for the feedback Edward.
> 
> > But what you've described sounds interesting; perhaps it would also
> >  help later with loop-variable handling?
> 
> Haven't considered how to use this for loop-variable handling, guess you
> mean
> applying what I have described to your previous loop detection RFC? I will
> look
> into your RFC later.
> 
> At the moment the design of the use->def chain is mainly to optimize 32-bit
> code-gen. I was about to satisfied with a local implementation and to share
> it
> to ML for further discussion. However, when manually check the optimization
> result on testcase with medium size (~1000 eBPF insns) and proper complexity
> (make sure path prunes etc are triggered inside verifier), I found the
> code-gen
> doesn't meet my expectation.
> 
> For example, for the following sequence, insn at 25 should operate on
> full-64
> bit but I found it is marked as 32-bit safe.
> 
>   25:    r7 = 1
>   26:    if r4 > r8 goto +1200 <L>
>   27:    r1 = *(u8 *)(r1 + 0)
>   28:    r1 &= 15
>   29:    r7 = 1
>   ...
> 
> L:
>   1227:  r0 = r7
>   1228:  exit
> 
> As described at previous email, the algorithm assume all insns are 32-bit
> safe
> first, then start to insns back to "64-bit" if there is any 64-bit use found
> for a insn.
> 
> Insn 25 is defining r7 which is used at the 1227 where its value propagated
> to
> r0 and then r0 is implicitly used at insn 1228 as it is a exit from main
> function to external.
> 
> For above example, as we don't know the external use of r0 at 1228 (exit
> from
> main to external), so r0 is treated as 64-bit implicit use. The define is at
> 1227, so insn 1227 is marked as "64-bit". The "64-bit" attribute should
> propagate to source register operand through register move and arithmetic,
> so
> r7 at insn 1227 is a "64-bit" use and should make its definition
> instruction,
> insn 25, marked as "64-bit". This is my thinking of how insn 25 should be
> marked.

all makes sense to me.

> Now this hasn't happened. I am still debugging the root cause, but kind of
> feel
> "64-bit" attribute propagation is the issue, it seems to me it can't be
> nicely
> integrated into the existing register read/write propagation infrastructure.

may be share your patches that modify the liveness propagation?

> For
> example, for a slightly more complex sequence which is composed of three
> states:
> 
> State A
>   ...
>   10: r6 = *(u32 *)(r10 - 4)
>   11: r7 = *(u32 *)(r10 - 8)
>   12: *(u64 *)(r10 - 16) = r6
>   13: *(u64 *)(r10 - 24) = r7
> 
> State B
>   14: r6 += 1
>   15: r7 += r6
>   16: *(u32 *)(r10 - 28) = r7
> 
> State C
>   ...
>   17: r3 += r7
>   18: r4 = 1
>   19: *(u64 *)(r10 - 32) = r3
>   20: *(u64 *)(r10 - 40) = r4
> 
> State A is parent of state B which is parent of state C.
> 
> Inside state C, at insn 20, r4 is a 64-bit read/use, so its define at 18 is
> marked as "64-bit". There is no register source at 18, so "64-bit" attribute
> propagation is stopped.
> 
> Then at insn 19, r3 is a 64-bit read/use, so its define at 17 is marked as
> "64-bit" read/use. Insn 17 has two register sources, r3 and r7, they become
> "64-bit" now, and their definition should be marked as "64-bit".
> 
> Now if the definition of r3 or r7 comes from parent state, then the parent

... the definition of r3 _and_ r7 ...
both need to propagate up with your algo, right?

> state
> should receive a "REG_LIVE_READ64", this is necessary if later another path
> reaches state C and triggers prune path, for which case that path should
> know
> there is "64-bit" use inside state C on some registers, and should use this
> information to mark "64-bit" insn.
> 
> If the definition of r3 or r7 is still inside state C, we need to keep
> walking
> up the instruction sequences, and propagate "64-bit" attribute upward until
> it
> goes beyond the state C.
> 
> The above propagation logic is quite different from existing register
> read/write
> propagation.
> For the latter, a write just screen up all following read, and
> a
> read would propagate directly to its parent is there is not previous write,
> no
> instruction analysis is required.

correct.
with such algo REG_LIVE_WRITTEN shouldn't be screening the propagation.

I think the patches will discuss the algo.
Also I think the initial state of 'everything is 32-bit safe'
and make marks to enforce 64-bit-ness is a dangerous algorithmic choice.
Why not to start at a safer state where everything is 64-bit
and work backward to find out which ones can be 32-bit?
That will be safer algo in case there are issues with bit like
you described in above.