[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <luget5z5ep2ikuwnkpddbdwl2yueb34nhqqms2hhij25guut4l@qm56ut744lz5>
Date: Wed, 18 Dec 2024 13:01:41 +0800
From: Jiayuan Chen <mrpre@....com>
To: John Fastabend <john.fastabend@...il.com>
Cc: bpf@...r.kernel.org, martin.lau@...ux.dev, ast@...nel.org,
edumazet@...gle.com, jakub@...udflare.com, davem@...emloft.net, dsahern@...nel.org,
kuba@...nel.org, pabeni@...hat.com, linux-kernel@...r.kernel.org, song@...nel.org,
andrii@...nel.org, mhal@...x.co, yonghong.song@...ux.dev, daniel@...earbox.net,
xiyou.wangcong@...il.com, horms@...nel.org
Subject: Re: [PATCH bpf v2 1/2] bpf: fix wrong copied_seq calculation
On Sun, Dec 15, 2024 at 07:32:01PM +0800, John Fastabend wrote:
> Jiayuan Chen wrote:
[...]
> > On Thu, Dec 12, 2024 at 05:36:15PM -0800, John Fastabend wrote:
> > [...]
> > >
> > >
> > > So here is what I believe the flow is,
> > >
> > > sk_psock_strp_data_ready
> > > -> str_data_ready
> > > -> strp_read_sock
> > > -> sock->ops->read_sock(.., strp_recv)
> > >
> > >
> > > We both have the same idea up to here. But then the proposed data_ready()
> > > call
> > >
> > > + while ((skb = skb_peek(&sk->sk_receive_queue)) != NULL) {
> > > + u8 tcp_flags;
> > > + int used;
> > > +
> > > + WARN_ON_ONCE(!skb_set_owner_sk_safe(skb, sk));
> > > + tcp_flags = TCP_SKB_CB(skb)->tcp_flags;
> > > + used = recv_actor(desc, skb, 0, skb->len);
> > >
> > > The recv_actor here is strp_recv() all good so far. But, because
> > > that skb is still on the sk_receive_queue() the TCP stack may
> > > at the same time do
> > >
> > > tcp_data_queue
> > > -> tcp_queue_rcv
> > > -> tail = skb_peek_tail(&sk->sk_receive_queue);
> > > tcp_try_coalesce(sk, tail, skb, fragstolen)
> > > -> skb_try_coalesce()
> > > ... skb->len += len
> > >
> > > So among other things you will have changed the skb->len and added some
> > > data to it. If this happens while you are running the recv actor we will
> > > eat the data by calling tcp_eat_recv_skb(). Any data added from the
> > > try_coalesce will just be dropped and never handled? The clone() from
> > > the strparser side doesn't help you the tcp_eat_recv_skb call will
> > > unlik the skb from the sk_receive_queue.
> > >
> > > I don't think you have any way to protect this at the moment.
> >
> > Thanks John Fastabend.
> >
> > It seems sk was always locked whenever data_ready called.
> >
> > '''
> > bh_lock_sock_nested(sk)
> > tcp_v4_do_rcv(sk)
> > |
> > |-> tcp_rcv_established
> > |-> tcp_queue_rcv
> > |-> tcp_try_coalesce
> > |
> > |-> tcp_rcv_state_process
> > |-> tcp_queue_rcv
> > |-> tcp_try_coalesce
> > |
> > |-> sk->sk_data_ready()
> >
> > bh_unlock_sock(sk)
> > '''
> >
> > other data_ready path:
> > '''
> > lock_sk(sk)
> > sk->sk_data_ready()
> > release_sock(sk)
> > '''
> >
> > I can not find any concurrency there.
>
> OK thanks, one more concern though. What if strp_recv thorws an ENOMEM
> error on the clone? Would we just drop the data then? This is problem
> not the expected behavior its already been ACKed.
>
> Thanks,
> John
Thank, I did miss ENOMEM error. I also realized that when an ENOMEM error
occurs, the strparser framework will replay the skb, so it is necessary to
record the offset read from the skb to avoid data duplication or loss.
Sorry for the slow response; it took quite some time to write test cases
and set up an environment to simulate ENOMEM. I will send the v3 patch.
Powered by blists - more mailing lists