linux-kernel - Re: [PATCH bpf-next v6 1/3] bpf: Allow trusted pointers to be passed to KF_TRUSTED

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221102204835.uxze44lottozywbj@apollo>
Date:   Thu, 3 Nov 2022 02:18:35 +0530
From:   Kumar Kartikeya Dwivedi <memxor@...il.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     David Vernet <void@...ifault.com>, bpf <bpf@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Song Liu <song@...nel.org>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        Stanislav Fomichev <sdf@...gle.com>,
        Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Kernel Team <kernel-team@...com>, Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH bpf-next v6 1/3] bpf: Allow trusted pointers to be passed
 to KF_TRUSTED_ARGS kfuncs

On Wed, Nov 02, 2022 at 08:04:57AM IST, Alexei Starovoitov wrote:
> On Tue, Nov 1, 2022 at 6:01 PM Kumar Kartikeya Dwivedi <memxor@...il.com> wrote:
> >
> > On Wed, 2 Nov 2022 at 06:02, Alexei Starovoitov
> > <alexei.starovoitov@...il.com> wrote:
> > >
> > > On Wed, Nov 02, 2022 at 04:01:11AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > On Wed, 2 Nov 2022 at 03:06, David Vernet <void@...ifault.com> wrote:
> > > > >
> > > > > On Tue, Nov 01, 2022 at 01:22:39PM -0700, Alexei Starovoitov wrote:
> > > > > > On Tue, Nov 1, 2022 at 11:11 AM David Vernet <void@...ifault.com> wrote:
> > > > > > >
> > > > > > > > What kind of bpf prog will be able to pass 'struct nf_conn___init *' into these bpf_ct_* ?
> > > > > > > > We've introduced / vs nf_conf specifically to express the relationship
> > > > > > > > between allocated nf_conn and other nf_conn-s via different types.
> > > > > > > > Why is this not enough?
> > > > > > >
> > > > > > > Kumar should have more context here (he originally suggested this in
> > > > > > > [0]),
> > > > > >
> > > > > > Quoting:
> > > > > > "
> > > > > > Unfortunately a side effect of this change is that now since
> > > > > > PTR_TO_BTF_ID without ref_obj_id is considered trusted, the bpf_ct_*
> > > > > > functions would begin working with tp_btf args.
> > > > > > "
> > > > > > I couldn't find any tracepoint that has nf_conn___init as an argument.
> > > > > > The whole point of that new type was to return it to bpf prog,
> > > > > > so the verifier type matches it when it's passed into bpf_ct_*
> > > > > > in turn.
> > > > > > So I don't see a need for a new OWNED flag still.
> > > > > > If nf_conn___init is passed into tracepoint it's a bug and
> > > > > > we gotta fix it.
> > > > >
> > > > > Yep, this is what I'm seeing as well. I think you're right that
> > > > > KF_OWNED_ARGS is just strictly unnecessary and that creating wrapper
> > > > > types is the way to enable an ownership model like this.
> > > > >
> > > >
> > > > It's not just nf_conn___init. Some CT helpers also take nf_conn.
> > > > e.g. bpf_ct_change_timeout, bpf_ct_change_status.
> > > > Right now they are only allowed in XDP and TC programs, so the tracing
> > > > args part is not a problem _right now_.
> > >
> > > ... and it will be fine to use bpf_ct_change_timeout from tp_btf as well.
> > >
> > > > So currently it may not be possible to pass such a trusted but
> > > > ref_obj_id == 0 nf_conn to those helpers.
> > > > But based on changes unrelated to this, it may become possible in the
> > > > future to obtain such a trusted nf_conn pointer.
> > >
> > > From kfunc's pov trusted pointer means valid pointer.
> > > It doesn't need to be ref_obj_id refcounted from the verifier pov.
> > > It can be refcounted on the kernel side and it will be trusted.
> > > The code that calls trace_*() passes only trusted pointers into tp-s.
> > > If there is a tracepoint somewhere in the kernel that uses a volatile
> > > pointer to potentially uaf kernel object it's a bug that should be fixed.
> > >
> >
> > This is all fine. I'm asking you to distinguish between
> > trusted-not-refcounted and trusted-and-refcounted.
>
> That's not what you're asking :)
>
> > It is necessary for nf_conn, since the object can be reused if the
> > refcount is not held.
>
> of course. No one argues the opposite.
>
> > Some other CPU could be reusing the same memory and allocating a new
> > nf_conn on it while we change its status.
> > So it's not ok to call bpf_ct_change_timeout/status on trusted
> > nf_conn, but only on trusted+refcounted nf_conn.
>
> and here we start to disagree.
>
> > Trusted doesn't capture the difference between 'valid' vs 'valid and
> > owned by prog' anymore with the new definition
> > for PTR_TO_BTF_ID.
>
> and here we disagree completely.
> You're asking to distinguish refcnt++ done by the program
> and recognized by the verifier as ref_obj_id > 0 vs
> refcnt++ done by the kernel code before it calls into tracepoint.
> That's odd, right?
> I don't think people adding kfuncs should care what piece
> of code before kfunc did refcnt++.
>

I think we're talking past each other. Your point is that whenever PTR_TRUSTED
is set (tracepoint args, etc.) the kernel itself should ensure it holds the
refcount on behalf of the program. There shouldn't be a scenario where
PTR_TRUSTED nf_conn is obtained and the program or the kernel isn't holding
an active refcount during its use with helpers/kfuncs.

I was not making that assumption, but it's not wrong. Which is why I wanted to
limit their use to only when verifier is tracking a reference in the program
for the nf_conn returned from an acquire kfunc.

> > Yes, in most cases the tracepoints/tracing functions whitelisted will
> > have the caller ensure that,
> > but we should then allow trusted nf_conn in those hooks explicitly,
> > not implicitly by default everywhere.
> > Until then it should be restricted to ref_obj_id > 0 IMO as it is right now.
> >
> > > > It is a requirement of those kfuncs that the nf_conn has its refcount
> > > > held while they are called.
> > >
> > > and it will be. Just not by the verifier.
> > >
> > > > KF_TRUSTED_ARGS was encoding this requirement before, but it wouldn't anymore.
> > > > It seems better to me to keep that restriction instead of relaxing it,
> > > > if it is part of the contract.
> > >
> > > Disagree as explained above.
> > >
> > > > It is fine to not require people to dive into these details and just
> > > > use KF_TRUSTED_ARGS in general, but we need something to cover special
> > > > cases like these where the object is only stable while we hold an
> > > > active refcount, RCU protection is not enough against reuse.
> > >
> > > This is not related to RCU. Let's not mix RCU concerns in here.
> > > It's a different topic.
> > >
> >
> > What I meant is that in the normal case, usually objects aren't reused
> > while the RCU read lock is held.
> > In case of nf_conn, the refcount needs to be held to ensure that,
> > since it uses SLAB_TYPESAFE_BY_RCU.
> > This is why bpf_ct_lookup needs to bump the refcount and match the key
> > after that again, and cannot just return the looked up ct directly.
>
> bpf_ct_lookup needs to bump a refcnt?!
> bpf_skb_ct_lookup calls __bpf_nf_ct_lookup
> that calls nf_conntrack_find_get() that does
> the search and incs the refcnt in a generic kernel code.
> There is nothing bpf specific stuff here. bpf kfunc didn't
> add any special refcnt incs.
>
> There are no tracepoints in netfilter, so this discussion
> is all theoretical, but if there was then the code
> should have made sure that refcnt is held before
> passing nf_conn into tracepoint.
>

Right, if so, it makes sense. Thanks for explaining.

> > > > It could be 'expert only' __ref suffix on the nf_conn arg, or
> > > > KF_OWNED_ARGS, or something else.
> > >
> > > I'm still against that.
> > >
> >
> > I understand (and agree) that you don't want to complicate things further.
> > It's fine if you want to deal with this later when the above concern
> > materializes. But it will be yet another thing to keep in mind for the
> > future.
>
> I don't share the concern.
> With nf_conn there is none, right?
> But imagine there is only RCU protected pointer that
> is passed into tracepoint somewhere.
> The verifier doesn't recognize refcnt++ on it and ref_obj_id == 0
> the kernel code doesn't do refcnt++ either.
> But it's still safe and this arg should still be
> PTR_TO_BTF_ID | PTR_TRUSTED.
> The bpf prog can pass it further into kfunc that has KF_TRUSTED_ARGS.
> Since RCU is held before calling into tracepoint the bpf prog
> has to be non sleepable. Additional rcu_read_lock done by
> the prog is redundant, but doesn't hurt.
> When prog is calling kfunc the pointer is still valid and
> kfunc can safely operate on it assuming that object is not going away.
> That is the definition of KF_TRUSTED_ARGS from pov of kfunc.
> You documented it yourself :)
> "
> The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
> indicates that the all pointer arguments will always have a guaranteed lifetime,
> "

So based on the above, verifier should only mark nf_conn as PTR_TRUSTED in cases
where the kernel already holds the refcount on behalf of the program, otherwise
for the non-PTR_TRUSTED case program must have ref_obj_id > 0?

I.e. 'the guaranteed lifetime' for the nf_conn case also means the refcount is
held in context where program can obtain PTR_TRUSTED nf_conn pointer.

It will surely be true in the tracepoint case if there ever is one. Won't work
easily for fentry/fexit but you already said that PTR_TRUSTED shouldn't be set
in that case, atleast by default and without allowlisting those BTF IDs.