[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CANn89i+cLhi8t04=+3LYh_8qDQA3fYZmKyGBuqitX+==KOvLLQ@mail.gmail.com>
Date: Tue, 20 Aug 2024 15:36:47 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Takamitsu Iwai <takamitz@...zon.co.jp>
Cc: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, David Ahern <dsahern@...nel.org>,
Kuniyuki Iwashima <kuniyu@...zon.com>, netdev@...r.kernel.org
Subject: Re: [PATCH v1 net 2/2] tcp: Don't recv() OOB twice.
On Tue, Aug 20, 2024 at 5:51 AM Takamitsu Iwai <takamitz@...zon.co.jp> wrote:
>
> commit 36893ef0b661 ("af_unix: Don't stop recv() at consumed ex-OOB
> skb.") finds a bug that TCP reads OOB which has been already recv()ed.
>
> This bug is caused because current TCP code does not have a process to
> check and skip consumed OOB data. So OOB exists until it is recv()ed
> even if it is already consumed through recv() with MSG_OOB option.
>
> We add code to check and skip consumed OOB when reading skbs.
>
> In this patch, we introduce urg_skb in tcp_sock to keep track of skbs
> containing consumed OOB. We make tcp_try_coalesce() avoid coalescing skb to
> urg_skb to locate OOB data at the last byte of urg_skb.
>
> I tried not to modify tcp_try_coalesce() by decrementing end_seq when
> OOB data is recv()ed. But this hack does not work when OOB data is at
> the middle of skb by coalescing OOB and normal skbs. Also, when the
> next OOB data comes in, we’ll lose the seq# of the consumed OOB to
> skip during the normal recv().
>
> Consequently, the code to prevent coalescing is now located within
> tcp_try_coalesce().
>
> This patch enables TCP to pass msg_oob selftests when removing
> tcp_incompliant braces in inline_oob_ahead_break and
> ex_oob_ahead_break tests.
>
> # RUN msg_oob.no_peek.ex_oob_ahead_break ...
> # OK msg_oob.no_peek.ex_oob_ahead_break
> ok 11 msg_oob.no_peek.ex_oob_ahead_break
> # RUN msg_oob.no_peek.inline_oob_ahead_break ...
> # OK msg_oob.no_peek.inline_oob_ahead_break
> ok 15 msg_oob.no_peek.inline_oob_ahead_break
>
> We will rewrite existing other code to use urg_skb and remove urg_data
> and urg_seq, which have the same functionality as urg_skb
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Takamitsu Iwai <takamitz@...zon.co.jp>
> ---
> include/linux/tcp.h | 1 +
> include/net/tcp.h | 3 ++-
> net/ipv4/tcp.c | 15 ++++++++++-----
> net/ipv4/tcp_input.c | 5 +++++
> tools/testing/selftests/net/af_unix/msg_oob.c | 10 ++--------
> 5 files changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 6a5e08b937b3..63234e8680e3 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -243,6 +243,7 @@ struct tcp_sock {
> struct minmax rtt_min;
> /* OOO segments go in this rbtree. Socket lock must be held. */
> struct rb_root out_of_order_queue;
> + struct sk_buff *urg_skb;
> u32 snd_ssthresh; /* Slow start size threshold */
> u8 recvmsg_inq : 1;/* Indicate # of bytes in queue upon recvmsg */
> __cacheline_group_end(tcp_sock_read_rx);
NACK, sorry, we will not change URG behavior, add yet another
dangerous dangling pointer.
We should not give users the impression this is maintained, useful,
and works as intended on various OS/kernels.
Powered by blists - more mailing lists