netdev - Re: [PATCH v3 bpf-next 1/2] bpf: Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk

Message-ID: <20190309052825.vykt2vgkk2w4rzby@kafai-mbp.dhcp.thefacebook.com>
Date:   Sat, 9 Mar 2019 05:28:28 +0000
From:   Martin Lau <kafai@...com>
To:     Lorenz Bauer <lmb@...udflare.com>
CC:     Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Alexei Starovoitov <ast@...com>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH v3 bpf-next 1/2] bpf: Fix bpf_tcp_sock and bpf_sk_fullsock
 issue related to bpf_sk_release

On Wed, Mar 06, 2019 at 03:59:40PM +0000, Lorenz Bauer wrote:
> On Mon, 4 Mar 2019 at 17:43, Martin Lau <kafai@...com> wrote:
> >
> > On Mon, Mar 04, 2019 at 10:33:46AM +0100, Daniel Borkmann wrote:
> > > On 03/02/2019 09:21 PM, Martin Lau wrote:
> > > > On Sat, Mar 02, 2019 at 10:03:03AM -0800, Alexei Starovoitov wrote:
> > > >> On Sat, Mar 02, 2019 at 08:10:10AM -0800, Martin KaFai Lau wrote:
> > > >>> Lorenz Bauer [thanks!] reported that a ptr returned by bpf_tcp_sock(sk)
> > > >>> can still be accessed after bpf_sk_release(sk).
> > > >>> Both bpf_tcp_sock() and bpf_sk_fullsock() have the same issue.
> > > >>> This patch addresses them together.
> > > >>>
> > > >>> A simple reproducer looks like this:
> > > >>>
> > > >>> sk = bpf_sk_lookup_tcp();
> > > >>> /* if (!sk) ... */
> > > >>> tp = bpf_tcp_sock(sk);
> > > >>> /* if (!tp) ... */
> > > >>> bpf_sk_release(sk);
> > > >>> snd_cwnd = tp->snd_cwnd; /* oops! The verifier does not complain. */
> > > >>>
> > > >>> The problem is the verifier did not scrub the register's states of
> > > >>> the tcp_sock ptr (tp) after bpf_sk_release(sk).
> > > >>>
> > > >>> [ Note that when calling bpf_tcp_sock(sk), the sk is not always
> > > >>>   refcount-acquired. e.g. bpf_tcp_sock(skb->sk). The verifier works
> > > >>>   fine for this case. ]
> > > >>>
> > > >>> Currently, the verifier does not track if a helper's return ptr (in REG_0)
> > > >>> is "carry"-ing one of its argument's refcount status. To carry this info,
> > > >>> the reg1->id needs to be stored in reg0.  The reg0->id has already
> > > >>> been used for NULL checking purpose.  Hence, a new "refcount_id"
> > > >>> is needed in "struct bpf_reg_state".
> > > >>>
> > > >>> With refcount_id, when bpf_sk_release(sk) is called, the verifier can scrub
> > > > >>> all reg states which have a matching refcount_id.  It is done with the changes
> > > >>> in release_reg_references().
> > > >>>
> > > >>> When acquiring and releasing a refcount, the reg->id is still used.
> > > >>> Hence, we cannot do "bpf_sk_release(tp)" in the above reproducer
> > > >>> example.
> > > >>
> > > >> I think the choice of returning listener full sock from req sock
> > > >> in sk_to_full_sk() was a wrong one.
> > > >> It seems better to make semantics of bpf_tcp_sock() and bpf_sk_fullsock() as
> > > >> always type cast or null.
> > > >> And have a separate helper for req socket that returns inet_reqsk(sk)->rsk_listener.
> > > >>
> > > >> Then it will be ok to call bpf_sk_release(tp) when tp came from bpf_sk_lookup_tcp.
> > > >> The verifier will know that it's the case because its ID will be in acquired_refs.
> > > >>
> > > >> The additional refcount_id won't be necessary.
> > > >> bpf_sk_fullsock() and bpf_tcp_sock() will not call sk_to_full_sk
> > > >> and the verifier will be copying reg1->id into reg0->id.
> > > >>
> > > >> In release_reference() the verifier will do
> > > >>   if (regs[i].id == id)
> > > >>     mark_reg_unknown(env, regs, i);
> > > >> for all socket types.
> > > >>
> > > >> release_reference_state() will stay as-is.
> > > >>
> > > >> imo such logic will be easier to follow.
> > > >>
> > > >> This implicit sk_to_full_sk() makes the whole thing much harder for the verifier
> > > >> and for the bpf program writers.
> > > >>
> > > >> The new bpf_get_listener_sock(sk) doesn't have to copy ID from reg1 to reg0
> > > >> since req socket will not be returned from bpf_sk_lookup_tcp and its ID
> > > > >> will not be stored in acquired_refs.
> > > >>
> > > >> Does it make sense ?
> > > > I like this idea.  Many thanks for thinking it through!
> > > >
> > > > > That allows bpf_sk_release(tp), removes the need to call
> > > > > bpf_sk_release() on the ptr returned from bpf_get_listener_sock(sk),
> > > > > and keeps a single reg->id.
> > > >
> > > > I think it should work.  I will rework the patches.
> > >
> > > Agreed, makes sense, that seems a much better fix.
> > While I was working on this change, based on the code, one issue I saw is:
> >
> > if the bpf prog does this:
> >
> > sk = bpf_sk_lookup_tcp();
> > /* if (!sk) ... */
> > fullsock = bpf_sk_fullsock(sk);
> > if (!fullsock) {
> >         bpf_sk_release(sk); /* Fail. sk_reg->id not found in ref state */
> >         return 0;
> > }
> >
> > The bpf_sk_release(sk) failed because the reference state has already
> > been released by "release_reference_state(state, fullsock_reg->id)" during
> > "if (!fullsock) /* handled by mark_ptr_or_null_regs(is_null == true) */"
> > Logically, I think bpf_sk_release(sk) should not fail regardless of
> > bpf_sk_fullsock() doing sk_to_full_sk() or not.
> >
> > bpf_sk_fullsock() could disallow PTR_TO_SOCKET or PTR_TO_TCP_SOCK but that
> > would be weird.
> >
> > I think we still need two ids.  Maybe rename the refcount_id proposed in
> > this patch to ref_obj_id, which is the original refcounted object id.
> >
> > If the sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock(),
> > these two helpers become a simple cast (i.e. either return the same pointer
> > or NULL).  Then bpf_sk_release(fullsock) and bpf_sk_release(tp) could work:
> >
> > - When is_null == true, release_reference_state(state, reg->id) is called.
> 
> If I understand correctly, this works because we never
> acquire_reference() for tp/fullsock,
> making this a no-op?
Sorry for the late reply.

Correct. Those two helpers do not take a ref, so
release_reference_state() will not be called.
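
A toy sketch of that point (not verifier code; the toy_* names and the
table layout are made up for illustration).  The id handed out for
tp/fullsock is never recorded as an acquired reference, so a release
keyed on that id finds nothing to drop and the reference taken for sk
stays tracked:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_REFS 8

/* Toy stand-in for the verifier's table of acquired reference ids. */
struct toy_ref_state {
	int ids[TOY_MAX_REFS];
	size_t n;
};

/* Record a newly acquired reference, e.g. for bpf_sk_lookup_tcp(). */
void toy_acquire_reference(struct toy_ref_state *s, int id)
{
	assert(s->n < TOY_MAX_REFS);
	s->ids[s->n++] = id;
}

/* Drop id from the table.  Returns false when id was never acquired,
 * which is the tp/fullsock case: nothing changes.
 */
bool toy_release_reference_state(struct toy_ref_state *s, int id)
{
	for (size_t i = 0; i < s->n; i++) {
		if (s->ids[i] == id) {
			/* Order does not matter, so fill the hole with the last id. */
			s->ids[i] = s->ids[--s->n];
			return true;
		}
	}
	return false;
}

/* Number of references still held. */
size_t toy_refs_held(const struct toy_ref_state *s)
{
	return s->n;
}
```

With sk acquired under id 1 and fullsock given id 2, releasing id 2
returns false and toy_refs_held() is still 1, so a later release of
id 1 succeeds.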

> 
> > - During bpf_sk_release(fullsock), release_reference(env, reg->ref_obj_id)
> >   is called so that sk, fullsock and tp with the same ref_obj_id will
> >   be mark_reg_unknown().
> 
> To clarify, the following states are possible:
> * id == 0, ref_obj_id == 0: not a pointer / reference
> * id != 0, ref_obj_id == 0: a reference which didn't have
> acquire_reference() called
> * id != 0, ref_obj_id != 0: a reference which had acquire_reference() called
> * id == 0, ref_obj_id != 0: illegal
In this two-id approach, I would think of it this way:
id and ref_obj_id serve two different purposes.  One is for
null checking and one is for reference tracking.  Once an id
has served its purpose, it can be set to 0.
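
To make the split concrete, here is a toy model (not verifier code;
struct toy_reg and the toy_* functions are made up for illustration,
and giving the cast result a fresh id is an assumption of the sketch).
The cast result gets its own id for null checking but inherits
ref_obj_id, so releasing through any alias scrubs all of them:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NREGS 4

/* Toy register state: id is for NULL checking, ref_obj_id is for
 * reference tracking.  Modelled on the discussion, not copied from
 * struct bpf_reg_state.
 */
struct toy_reg {
	bool valid;      /* false once scrubbed (mark_reg_unknown analogue) */
	int id;
	int ref_obj_id;  /* 0: not tied to an acquired reference */
};

struct toy_regs {
	struct toy_reg r[TOY_NREGS];
	int next_id;
};

/* sk = bpf_sk_lookup_tcp(): acquires a reference. */
void toy_lookup(struct toy_regs *s, size_t dst)
{
	assert(dst < TOY_NREGS);
	int id = ++s->next_id;
	s->r[dst] = (struct toy_reg){ .valid = true, .id = id, .ref_obj_id = id };
}

/* tp = bpf_tcp_sock(sk) as a plain cast: fresh id, inherited ref_obj_id. */
void toy_cast(struct toy_regs *s, size_t dst, size_t src)
{
	assert(dst < TOY_NREGS && src < TOY_NREGS);
	s->r[dst] = (struct toy_reg){
		.valid = true,
		.id = ++s->next_id,
		.ref_obj_id = s->r[src].ref_obj_id,
	};
}

/* bpf_sk_release(reg): scrub every register sharing reg's ref_obj_id.
 * Returns how many registers were scrubbed.
 */
size_t toy_sk_release(struct toy_regs *s, size_t regno)
{
	assert(regno < TOY_NREGS);
	int ref = s->r[regno].ref_obj_id;
	size_t scrubbed = 0;

	if (ref == 0)
		return 0;
	for (size_t i = 0; i < TOY_NREGS; i++) {
		if (s->r[i].valid && s->r[i].ref_obj_id == ref) {
			s->r[i] = (struct toy_reg){ 0 };
			scrubbed++;
		}
	}
	return scrubbed;
}
```

In the reproducer's terms: after toy_lookup() into reg 0 and toy_cast()
into reg 1, toy_sk_release() on either register invalidates both, so a
later tp->snd_cwnd style access would no longer find a valid pointer.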

Regardless, I am working on another idea that does not
require two ids in bpf_reg_state.  I will post
an update on this.