Message-ID: <20190416162022.wsymfdriull5srvo@ast-mbp.dhcp.thefacebook.com>
Date: Tue, 16 Apr 2019 09:20:24 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Jiong Wang <jiong.wang@...ronome.com>
Cc: daniel@...earbox.net, bpf@...r.kernel.org, netdev@...r.kernel.org,
oss-drivers@...ronome.com
Subject: Re: [PATCH v4 bpf-next 01/15] bpf: split read liveness into
REG_LIVE_READ64 and REG_LIVE_READ32
On Tue, Apr 16, 2019 at 08:39:30AM +0100, Jiong Wang wrote:
>
> Alexei Starovoitov writes:
>
> > On Mon, Apr 15, 2019 at 06:26:11PM +0100, Jiong Wang wrote:
> >> Register liveness infrastructure doesn't track register read width at the
> >> moment, while the width information will be needed for the later 32-bit
> >> safety analysis pass.
> >>
> >> This patch takes the first step to split read liveness into REG_LIVE_READ64
> >> and REG_LIVE_READ32.
> >>
> >> Liveness propagation code is updated accordingly. It is taught to
> >> propagate REG_LIVE_READ64 and REG_LIVE_READ32 within the same
> >> propagation iteration. For example, "mark_reg_read" now propagates "flags",
> >> which could be multiple read bits instead of the single REG_LIVE_READ64.
> >>
> >> A write still screens off reads of any width.
> >>
> >> Signed-off-by: Jiong Wang <jiong.wang@...ronome.com>
> >> ---
> >> include/linux/bpf_verifier.h | 8 +--
> >> kernel/bpf/verifier.c | 119 +++++++++++++++++++++++++++++++++++++++----
> >> 2 files changed, 115 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> >> index b3ab61f..fba0ebb 100644
> >> --- a/include/linux/bpf_verifier.h
> >> +++ b/include/linux/bpf_verifier.h
> >> @@ -36,9 +36,11 @@
> >> */
> >> enum bpf_reg_liveness {
> >> REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */
> >> - REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */
> >> - REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */
> >> - REG_LIVE_DONE = 4, /* liveness won't be updating this register anymore */
> >> + REG_LIVE_READ32 = 0x1, /* reg was read, so we're sensitive to initial value */
> >> + REG_LIVE_READ64 = 0x2, /* likewise, but full 64-bit content matters */
> >> + REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
> >> + REG_LIVE_WRITTEN = 0x4, /* reg was written first, screening off later reads */
> >> + REG_LIVE_DONE = 0x8, /* liveness won't be updating this register anymore */
> >> };
> >>
> >> struct bpf_reg_state {
> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> >> index c722015..5784b279 100644
> >> --- a/kernel/bpf/verifier.c
> >> +++ b/kernel/bpf/verifier.c
> >> @@ -1135,7 +1135,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
> >> */
> >> static int mark_reg_read(struct bpf_verifier_env *env,
> >> const struct bpf_reg_state *state,
> >> - struct bpf_reg_state *parent)
> >> + struct bpf_reg_state *parent, u8 flags)
> >> {
> >> bool writes = parent == state->parent; /* Observe write marks */
> >> int cnt = 0;
> >> @@ -1150,17 +1150,23 @@ static int mark_reg_read(struct bpf_verifier_env *env,
> >> parent->var_off.value, parent->off);
> >> return -EFAULT;
> >> }
> >> - if (parent->live & REG_LIVE_READ)
> >> + /* The first condition is much more likely to be true than the
> >> + * second, make it checked first.
> >> + */
> >> + if ((parent->live & REG_LIVE_READ) == flags ||
> >> + parent->live & REG_LIVE_READ64)
> >> /* The parentage chain never changes and
> >> * this parent was already marked as LIVE_READ.
> >> * There is no need to keep walking the chain again and
> >> * keep re-marking all parents as LIVE_READ.
> >> * This case happens when the same register is read
> >> * multiple times without writes into it in-between.
> >> + * Also, if parent has REG_LIVE_READ64 set, then no need
> >> + * to set the weak REG_LIVE_READ32.
> >> */
> >> break;
> >> /* ... then we depend on parent's value */
> >> - parent->live |= REG_LIVE_READ;
> >> + parent->live |= flags;
> >> state = parent;
> >> parent = state->parent;
> >> writes = true;
> >> @@ -1172,12 +1178,95 @@ static int mark_reg_read(struct bpf_verifier_env *env,
> >> return 0;
> >> }
> >>
> >> +/* This function is supposed to be used by the following 32-bit optimization
> >> + * code only. It returns TRUE if the source or destination register operates
> >> + * on 64-bit, otherwise return FALSE.
> >> + */
> >> +static bool is_reg64(struct bpf_insn *insn, u32 regno,
> >> + struct bpf_reg_state *reg, enum reg_arg_type t)
> >> +{
> >> + u8 code, class, op;
> >> +
> >
> > why is it called for the case when t != SRC_OP ?
> > this patch uses the return value only in the t == SRC_OP case,
> > and the other patches don't use is_reg64() at all.
>
> It is used for the case when t == DST*, in patch 2/15; please search "rw64"
> in that patch.
argh. Such a patch split didn't make it easy to review.
> And "is_reg64" aims to return TRUE if it is a 64-bit read when T == SRC_OP,
> and TRUE if it is a 64-bit write when T == DST*.
>
> >> + code = insn->code;
> >> + class = BPF_CLASS(code);
> >> + op = BPF_OP(code);
> >> + if (class == BPF_JMP) {
> >> + /* BPF_EXIT will reach here because of return value readability
> >> + * test for "main" which has s32 return value.
> >> + */
> >> + if (op == BPF_EXIT)
> >> + return false;
> >
> > That's not correct. bpf2bpf calls return 64-bit values.
>
> bpf2bpf calls have all instructions exposed to the insn walker, so the
> data-flow is naturally tracked. For example:
>
> callee:
> w0 = w2
> exit
So then it needs to be different code? Like read32_maybe?
> caller:
> call callee2
> r2 = r0
>
> insn walker should have marked REG_0 as a sub-register define in the callee's
> frame, and such a marker is naturally propagated back to the caller's frame
> inside "prepare_func_exit", which is doing:
>
> /* return to the caller whatever r0 had in the callee */
> caller->regs[BPF_REG_0] = *r0;
>
> This copies the parentage chain, and also copies the sub-register definition
> information, as we have merged it into the reg state.
>
> > bpf abi is such that all bpf progs return 64-bit values.
> > Historically we truncate to 32-bit in BPF_PROG_RUN,
> > but some future bpf hook might use all 64 bits of return value.
>
> hmm, was thinking bpf main always returns 32-bit, so it is safe to return
> FALSE here. If we could have a 64-bit main return, then perhaps for main,
> always do zero-extension on r0 if it comes from a sub-register def. This
> seems a little bit unnecessary, given there is prog_type available during
> the check, using which we can know whether the return value is 64-bit or not.
>
> thoughts?
I think it's safer to go with read64 for now and optimize later
if really necessary.
> >> + if (op == BPF_CALL) {
> >> + /* BPF to BPF call will reach here because of marking
> >> + * caller saved clobber with DST_OP_NO_MARK for which we
> >> + * don't care the register def because they are anyway
> >> + * marked as NOT_INIT already.
> >> + */
> >
> > the comment doesn't seem to match the code. why return anything here?
>
> It is to handle the case like:
>
> check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
>
> which is called inside "check_func_call" for a call insn. I think for caller
> saved registers, returning anything is fine. They must have a new def inside
> the callee before being used. And when we return to the caller, they must be
> restored before use, otherwise the prog will be rejected because their status
> is marked as NOT_INIT.
but neither this patch nor patch 2 is using the return value.
>
> So, for all cases, they are tracked by the insn walker accurately.
>
> > The return value won't be used anyway.
> >
> > If is_reg64() is inside check_reg_arg() under if (t == SRC_OP)
> > all these corner cases wouldn't cause review headaches and can be converted
> > to WARN_ONCE.
> >
> > What am I missing?
>
> As explained, is_reg64 could be used for both t == SRC_OP and DST*. And it
> will be called for instructions with implicit register usage, for example
> CALL, and we need to return the access width for those implicitly used
> registers here as well.
>
> Make sense?
Kinda. I'll take a look again. My first reaction is that patches 1 and 2
should be one patch or split differently.