netdev - Re: [PATCH v4 bpf-next 6/6] bpf: Refactor RCU enforcement in the verifier.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20230302212344.snafoop5hytngskk@MacBook-Pro-6.local>
Date:   Thu, 2 Mar 2023 13:23:44 -0800
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     David Vernet <void@...ifault.com>
Cc:     davem@...emloft.net, daniel@...earbox.net, andrii@...nel.org,
        martin.lau@...nel.org, davemarchevsky@...a.com, tj@...nel.org,
        memxor@...il.com, netdev@...r.kernel.org, bpf@...r.kernel.org,
        kernel-team@...com
Subject: Re: [PATCH v4 bpf-next 6/6] bpf: Refactor RCU enforcement in the
 verifier.

On Wed, Mar 01, 2023 at 10:05:19PM -0600, David Vernet wrote:
> >  
> > @@ -6373,7 +6382,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf,
> >  			stype = btf_type_skip_modifiers(btf, mtype->type, &id);
> >  			if (btf_type_is_struct(stype)) {
> >  				*next_btf_id = id;
> > -				*flag = tmp_flag;
> > +				*flag |= tmp_flag;
> 
> Now that this function doesn't do a full write of the variable, the
> semantics have changed such that the caller has to initialize the
> variable on behalf of bpf_struct_walk(). This makes callers such as
> btf_struct_ids_match() (line 6503) have random crud in the pointer.
> Doesn't really matter for that case because the variable isn't used
> anyways, but it seems slightly less brittle to initialize *flag to 0 at
> the beginning of the function to avoid requiring the caller to
> initialize it. Wdyt?

Good idea. Fixed.

> > +BTF_ID_FLAGS(func, bpf_cpumask_copy, KF_TRUSTED_ARGS | KF_RCU)
> > +BTF_ID_FLAGS(func, bpf_cpumask_any, KF_TRUSTED_ARGS | KF_RCU)
> > +BTF_ID_FLAGS(func, bpf_cpumask_any_and, KF_TRUSTED_ARGS | KF_RCU)
> 
> It's functionally the same, but could you please remove KF_TRUSTED_ARGS
> given that it's accepted for KF_RCU? We should ideally be removing
> KF_TRUSTED_ARGS altogether soon(ish) anyways, so might as well do it
> here.

done.

> > +		if (type_is_trusted(env, reg, off)) {
> > +			flag |= PTR_TRUSTED;
> > +		} else if (in_rcu_cs(env) && !type_may_be_null(reg->type)) {
> > +			if (type_is_rcu(env, reg, off)) {
> > +				/* ignore __rcu tag and mark it MEM_RCU */
> > +				flag |= MEM_RCU;
> > +			} else if (flag & MEM_RCU) {
> > +				/* __rcu tagged pointers can be NULL */
> > +				flag |= PTR_MAYBE_NULL;
> 
> I'm not quite understanding the distinction between manually-specified
> RCU-safe types being non-nullable, vs. __rcu pointers being nullable.
> Aren't they functionally the exact same thing, with the exception being
> that gcc doesn't support __rcu, so we've decided to instead manually
> specify them for some types that we know we need until __rcu is the
> default mechanism?  If the plan is to remove these macros once gcc
> supports __rcu, this could break some programs that are expecting the
> fields to be non-NULL, no?

BTF_TYPE_SAFE_RCU is a workaround for now.
We can make it exactly like __rcu, but it would split
the natural dereference of task->cgroups->dfl_cgrp into
two derefs with extra !=NULL check in-between which is ugly and unnecessary.

> I see why we're doing this in the interim -- task->cgroups,
> css->dfl_cgrp, task->cpus_ptr, etc can never be NULL. The problem is
> that I think those are implementation details that are separate from the
> pointers being RCU safe. This seems rather like we need a separate
> non-nullable tag, or something to that effect.

Right. It is certainly an implementation detail.
We'd need a new __not_null_mostly tag or __not_null_after_init.
(similar to __read_mostly and __ro_after_init).
Where non-null property is true when bpf get to see these structures.

The current allowlist is incomplete and far from perfect.
I suspect we'd need to add a bunch more during this release cycle.
This patch is aggressive in deprecation of old ptr_to_btf_id.
Some breakage is expected. Hence the timing to do it right now
at the beginning of the cycle.

> >  		flag &= ~PTR_TRUSTED;
> 
> Do you know what else is left for us to fix to be able to just set
> PTR_UNTRUSTED here?

All "ctx->" derefs. check_ctx_access() returns old school PTR_TO_BTF_ID.
We can probably mark all of them as trusted, but need to audit a lot of code.
I've also played with forcing helpers with ARG_PTR_TO_BTF_ID to be trusted,
but still too much selftest breakage to even look at.

The patch also has:
+                       if (BTF_INFO_KIND(mtype->info) == BTF_KIND_UNION &&
+                           btf_type_vlen(mtype) != 1)
+                               /*
+                                * walking unions yields untrusted pointers
+                                * with exception of __bpf_md_ptr and other
+                                * unions with a single member
+                                */
+                               *flag |= PTR_UNTRUSTED;
this is in particular to make skb->dev deref to return untrusted.
In this past we allowed skb->dev->ifindex to go via PTR_TO_BTF_ID and PROBE_MEM.
It's safe, but not clean. And we have no safe way to get trusted 'dev' to pass into helpers.
It's time to clean this all up as well, but it will require rearranging fields in sk_buff.
Lots of work ahead.