netdev - Re: [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <lyimveo0ul.fsf@netronome.com>
Date:   Tue, 16 Apr 2019 08:39:30 +0100
From:   Jiong Wang <jiong.wang@...ronome.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Jiong Wang <jiong.wang@...ronome.com>, daniel@...earbox.net,
        bpf@...r.kernel.org, netdev@...r.kernel.org,
        oss-drivers@...ronome.com
Subject: Re: [PATCH v4 bpf-next 01/15] bpf: split read liveness into REG_LIVE_READ64 and REG_LIVE_READ32


Alexei Starovoitov writes:

> On Mon, Apr 15, 2019 at 06:26:11PM +0100, Jiong Wang wrote:
>> Register liveness infrastructure doesn't track register read width at the
>> moment, while the width information will be needed for the later 32-bit
>> safety analysis pass.
>> 
>> This patch take the first step to split read liveness into REG_LIVE_READ64
>> and REG_LIVE_READ32.
>> 
>> Liveness propagation code are updated accordingly. They are taught to
>> understand how to propagate REG_LIVE_READ64 and REG_LIVE_READ32 at the same
>> propagation iteration. For example, "mark_reg_read" now propagate "flags"
>> which could be multiple read bits instead of the single REG_LIVE_READ64.
>> 
>> A write still screen off all width of reads.
>> 
>> Signed-off-by: Jiong Wang <jiong.wang@...ronome.com>
>> ---
>>  include/linux/bpf_verifier.h |   8 +--
>>  kernel/bpf/verifier.c        | 119 +++++++++++++++++++++++++++++++++++++++----
>>  2 files changed, 115 insertions(+), 12 deletions(-)
>> 
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index b3ab61f..fba0ebb 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -36,9 +36,11 @@
>>   */
>>  enum bpf_reg_liveness {
>>  	REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */
>> -	REG_LIVE_READ, /* reg was read, so we're sensitive to initial value */
>> -	REG_LIVE_WRITTEN, /* reg was written first, screening off later reads */
>> -	REG_LIVE_DONE = 4, /* liveness won't be updating this register anymore */
>> +	REG_LIVE_READ32 = 0x1, /* reg was read, so we're sensitive to initial value */
>> +	REG_LIVE_READ64 = 0x2, /* likewise, but full 64-bit content matters */
>> +	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
>> +	REG_LIVE_WRITTEN = 0x4, /* reg was written first, screening off later reads */
>> +	REG_LIVE_DONE = 0x8, /* liveness won't be updating this register anymore */
>>  };
>>  
>>  struct bpf_reg_state {
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index c722015..5784b279 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -1135,7 +1135,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
>>   */
>>  static int mark_reg_read(struct bpf_verifier_env *env,
>>  			 const struct bpf_reg_state *state,
>> -			 struct bpf_reg_state *parent)
>> +			 struct bpf_reg_state *parent, u8 flags)
>>  {
>>  	bool writes = parent == state->parent; /* Observe write marks */
>>  	int cnt = 0;
>> @@ -1150,17 +1150,23 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>>  				parent->var_off.value, parent->off);
>>  			return -EFAULT;
>>  		}
>> -		if (parent->live & REG_LIVE_READ)
>> +		/* The first condition is much more likely to be true than the
>> +		 * second, make it checked first.
>> +		 */
>> +		if ((parent->live & REG_LIVE_READ) == flags ||
>> +		    parent->live & REG_LIVE_READ64)
>>  			/* The parentage chain never changes and
>>  			 * this parent was already marked as LIVE_READ.
>>  			 * There is no need to keep walking the chain again and
>>  			 * keep re-marking all parents as LIVE_READ.
>>  			 * This case happens when the same register is read
>>  			 * multiple times without writes into it in-between.
>> +			 * Also, if parent has REG_LIVE_READ64 set, then no need
>> +			 * to set the weak REG_LIVE_READ32.
>>  			 */
>>  			break;
>>  		/* ... then we depend on parent's value */
>> -		parent->live |= REG_LIVE_READ;
>> +		parent->live |= flags;
>>  		state = parent;
>>  		parent = state->parent;
>>  		writes = true;
>> @@ -1172,12 +1178,95 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>>  	return 0;
>>  }
>>  
>> +/* This function is supposed to be used by the following 32-bit optimization
>> + * code only. It returns TRUE if the source or destination register operates
>> + * on 64-bit, otherwise return FALSE.
>> + */
>> +static bool is_reg64(struct bpf_insn *insn, u32 regno,
>> +		     struct bpf_reg_state *reg, enum reg_arg_type t)
>> +{
>> +	u8 code, class, op;
>> +
>
> why is it called for case when t != SRC_OP ?
> this patch is using the return value only in t == SRC_OP case
> and other patches don't use is_reg64() at all.

It is used for case when t == DST*, in patch 2/15, please search "rw64" in
that patch.

And "is_reg64" aims to return TRUE if it is 64-bit read when T == SRC_OP,
and return TRUE if it is 64-bit write when T = DST*. 

>> +	code = insn->code;
>> +	class = BPF_CLASS(code);
>> +	op = BPF_OP(code);
>> +	if (class == BPF_JMP) {
>> +		/* BPF_EXIT will reach here because of return value readability
>> +		 * test for "main" which has s32 return value.
>> +		 */
>> +		if (op == BPF_EXIT)
>> +			return false;
>
> That's not incorrect. bpf2bpf calls return 64-bit values.

bpf2bpf calls has all instructions exposed to insn walker, so the data-flow
is naturally tracked. For example:
  
callee:
  w0 = w2
  exit

caller:
  call callee2
  r2 = r0
  
insn walker should have marked REG_0 is a sub-register define in callee's
frame, and such marker is naturally propagetd back to caller's frame inside
"prepare_func_exit" which is doing:

  /* return to the caller whatever r0 had in the callee */                 
  caller->regs[BPF_REG_0] = *r0;

This copies parentage chain, and also copies the sub-register definition
information as we have merged it into reg state.

> bpf abi is such that all bpf progs return 64-bit values.
> Historically we truncate to 32-bit in BPF_PROG_RUN,
> but some future bpf hook might use all 64 bits of return value.

hmm, was thinking bpf main always return 32-bit, so safe to return FALSE
here. If we could have 64-bit main return, then perhaps for main, always do
zero-extension on r0 if it comes from sub-register def, this seems a little
bit unnecessary, given there is prog_type available during the check, using
which we can known whether whether return value is 64-bit or not.

thoughts?

>> +		if (op == BPF_CALL) {
>> +			/* BPF to BPF call will reach here because of marking
>> +			 * caller saved clobber with DST_OP_NO_MARK for which we
>> +			 * don't care the register def because they are anyway
>> +			 * marked as NOT_INIT already.
>> +			 */
>
> the comment doesn't seem to match the code. why return anything here?

It is to handle the case like:

  check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);

which is called inside "check_func_call" for call insn. I think for caller
saved register, return anything is fine. They must have new def inside
callee before used. And when returned to caller, they must be restored
before used otherwise the prog will be rejected due to their status is
marked as NOT_INIT.

So, for all cases, they are tracked insn walker accurately.

> The return value won't be used anyway.
>
> If is_reg64() is inside check_reg_arg() under if (t == SRC_OP)
> all these corner cases wouldn't cause review headaches and can be converted
> to WARN_ONCE.
>
> What am I missing?

As explained, is_reg64 could be used by both t == SRC_OP and DST*. And is
will be called for instructions with implicit register usage for example
CALL, and we need to return access width for those implicitly used register
here as well.

Make sense?

Regards,
Jiong