netdev - Re: [Patch bpf-next] tcp: fix sock skb accounting in tcp_read

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YtQ/Np8DZBJVFO3l@pop-os.localdomain>
Date:   Sun, 17 Jul 2022 09:56:22 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     netdev <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Cong Wang <cong.wang@...edance.com>,
        syzbot <syzbot+a0e6f8738b58f7654417@...kaller.appspotmail.com>,
        Stanislav Fomichev <sdf@...gle.com>,
        John Fastabend <john.fastabend@...il.com>
Subject: Re: [Patch bpf-next] tcp: fix sock skb accounting in tcp_read_skb()

On Tue, Jul 12, 2022 at 03:20:37PM +0200, Eric Dumazet wrote:
> On Sun, Jul 10, 2022 at 12:20 AM Cong Wang <xiyou.wangcong@...il.com> wrote:
> >
> > From: Cong Wang <cong.wang@...edance.com>
> >
> > Before commit 965b57b469a5 ("net: Introduce a new proto_ops
> > ->read_skb()"), skb was not dequeued from receive queue hence
> > when we close TCP socket skb can be just flushed synchronously.
> >
> > After this commit, we have to uncharge skb immediately after being
> > dequeued, otherwise it is still charged in the original sock. And we
> > still need to retain skb->sk, as eBPF programs may extract sock
> > information from skb->sk. Therefore, we have to call
> > skb_set_owner_sk_safe() here.
> >
> > Fixes: 965b57b469a5 ("net: Introduce a new proto_ops ->read_skb()")
> > Reported-and-tested-by: syzbot+a0e6f8738b58f7654417@...kaller.appspotmail.com
> > Tested-by: Stanislav Fomichev <sdf@...gle.com>
> > Cc: Eric Dumazet <edumazet@...gle.com>
> > Cc: John Fastabend <john.fastabend@...il.com>
> > Signed-off-by: Cong Wang <cong.wang@...edance.com>
> > ---
> >  net/ipv4/tcp.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 9d2fd3ced21b..c6b1effb2afd 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -1749,6 +1749,7 @@ int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
> >                 int used;
> >
> >                 __skb_unlink(skb, &sk->sk_receive_queue);
> > +               WARN_ON(!skb_set_owner_sk_safe(skb, sk));
> >                 used = recv_actor(sk, skb);
> >                 if (used <= 0) {
> >                         if (!copied)
> > --
> > 2.34.1
> >
> 
> I am reading tcp_read_skb(),it seems to have other bugs.
> I wonder why syzbot has not caught up yet.

As you mentioned this here I assume you suggest I should fix all bugs in
one patch? (I am fine either way in this case, only slightly prefer to fix
one bug in each patch for readability.)

> 
> It ignores the offset value from tcp_recv_skb(), this looks wrong to me.
> The reason tcp_read_sock() passes a @len parameter is that is it not
> skb->len, but (skb->len - offset)

If I understand tcp_recv_skb() correctly it only returns an offset for a
partial read of an skb. IOW, if we always read an entire skb at a time,
offset makes no sense here, right?

> 
> Also if recv_actor(sk, skb) returns 0, we probably still need to
> advance tp->copied_seq,
> for instance if skb had a pure FIN (and thus skb->len == 0), since you
> removed the skb from sk_receive_queue ?

Doesn't the following code handle this case?

        if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) {
                consume_skb(skb);
                ++seq;
                break;
        }

which is copied from tcp_read_sock()...

Thanks.