Message-ID: <20210702010455.3h4v5c4g7wx2aeth@kafai-mbp.dhcp.thefacebook.com>
Date: Thu, 1 Jul 2021 18:04:55 -0700
From: Martin KaFai Lau <kafai@...com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
CC: <davem@...emloft.net>, <daniel@...earbox.net>, <andrii@...nel.org>,
<netdev@...r.kernel.org>, <bpf@...r.kernel.org>,
<kernel-team@...com>
Subject: Re: [PATCH v4 bpf-next 1/9] bpf: Introduce bpf timers.
On Thu, Jul 01, 2021 at 12:20:36PM -0700, Alexei Starovoitov wrote:
[ ... ]
> +BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn,
> + struct bpf_prog *, prog)
> +{
> + struct bpf_hrtimer *t;
> + struct bpf_prog *prev;
> + int ret = 0;
> +
> + if (in_nmi())
> + return -EOPNOTSUPP;
> + ____bpf_spin_lock(&timer->lock); /* including irqsave */
> + t = timer->timer;
> + if (!t) {
> + ret = -EINVAL;
> + goto out;
> + }
> + if (!atomic64_read(&(t->map->usercnt))) {
> + /* maps with timers must be either held by user space
> + * or pinned in bpffs. Otherwise timer might still be
> + * running even when bpf prog is detached and user space
> + * is gone, since map_release_uref won't ever be called.
> + */
> + ret = -EPERM;
> + goto out;
> + }
> + prev = t->prog;
> + if (prev != prog) {
> + if (prev)
> + /* Drop prev prog refcnt when swapping with new prog */
> + bpf_prog_put(prev);
> + /* Bump prog refcnt once. Every bpf_timer_set_callback()
> + * can pick different callback_fn-s within the same prog.
> + */
> + bpf_prog_inc(prog);
I think prog->aux->refcnt could already be zero here when the bpf prog
is in its final run and the RCU grace period has not ended yet,
so bpf_prog_inc_not_zero() should be used here instead.
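Something along these lines is what I have in mind (untested sketch only,
reusing the prev/prog/ret variables and the out label from the hunk above):

	prev = t->prog;
	if (prev != prog) {
		/* Take a ref only if the prog refcnt has not already
		 * dropped to zero.  bpf_prog_inc_not_zero() returns
		 * ERR_PTR(-ENOENT) in that case.
		 */
		prog = bpf_prog_inc_not_zero(prog);
		if (IS_ERR(prog)) {
			ret = PTR_ERR(prog);
			goto out;
		}
		if (prev)
			/* Drop prev prog refcnt when swapping with new prog */
			bpf_prog_put(prev);
		t->prog = prog;
	}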
> + t->prog = prog;
> + }
> + t->callback_fn = callback_fn;
> +out:
> + ____bpf_spin_unlock(&timer->lock); /* including irqrestore */
> + return ret;
> +}
> +
[ ... ]
> @@ -5837,6 +5906,8 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
> func_id != BPF_FUNC_map_pop_elem &&
> func_id != BPF_FUNC_map_peek_elem &&
> func_id != BPF_FUNC_for_each_map_elem &&
> + func_id != BPF_FUNC_timer_init &&
> + func_id != BPF_FUNC_timer_set_callback &&
It seems checking the poisoned map_ptr_state is not needed for the timer helpers.
Is this change needed?
[ ... ]
> @@ -12584,6 +12662,46 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
> continue;
> }
>
> + if (insn->imm == BPF_FUNC_timer_set_callback) {
> + /* There is no need to do:
> + * aux = &env->insn_aux_data[i + delta];
> + * if (bpf_map_ptr_poisoned(aux)) return -EINVAL;
> + * for bpf_timer_set_callback(). If the same callback_fn is shared
> + * by different timers in different maps the poisoned check
> + * will return false positive.
> + *
> + * The verifier will process callback_fn as many times as necessary
> + * with different maps and the register states prepared by
> + * set_timer_callback_state will be accurate.
> + *
> + * The following use case is valid:
> + * map1 is shared by prog1, prog2, prog3.
> + * prog1 calls bpf_timer_init for some map1 elements
> + * prog2 calls bpf_timer_set_callback for some map1 elements.
> + * Those that were not bpf_timer_init-ed will return -EINVAL.
> + * prog3 calls bpf_timer_start for some map1 elements.
> + * Those that were not both bpf_timer_init-ed and
> + * bpf_timer_set_callback-ed will return -EINVAL.
> + */
> + struct bpf_insn ld_addrs[2] = {
> + BPF_LD_IMM64(BPF_REG_3, (long)prog),
The "prog" pointer value is used here.
> + };
> +
> + insn_buf[0] = ld_addrs[0];
> + insn_buf[1] = ld_addrs[1];
> + insn_buf[2] = *insn;
> + cnt = 3;
> +
> + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
> + if (!new_prog)
> + return -ENOMEM;
> +
> + delta += cnt - 1;
> + env->prog = prog = new_prog;
After bpf_patch_insn_data(), a new prog may be allocated and the old one freed.
Is the old "prog" pointer value embedded by the BPF_LD_IMM64 above
updated accordingly?  I could have missed something.
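One way to side step this might be to embed prog->aux instead of prog itself:
the aux pointer is kept across bpf_prog_realloc() and aux->prog is updated to
point at the current prog.  Untested sketch:

	struct bpf_insn ld_addrs[2] = {
		BPF_LD_IMM64(BPF_REG_3, (long)prog->aux),
	};

and on the helper side take the aux pointer and look the prog up from it:

	BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer,
		   void *, callback_fn, struct bpf_prog_aux *, aux)
	{
		struct bpf_prog *prog = aux->prog;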
> + insn = new_prog->insnsi + i + delta;
> + goto patch_call_imm;
> + }
> +
> /* BPF_EMIT_CALL() assumptions in some of the map_gen_lookup
> * and other inlining handlers are currently limited to 64 bit
> * only.