[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <fc39c72c-3668-80b9-1a32-a6283e9d3a97@huawei.com>
Date: Thu, 27 Oct 2022 21:32:36 +0800
From: "luwei (O)" <luwei32@...wei.com>
To: Eric Dumazet <edumazet@...gle.com>,
Pavel Emelyanov <ovzxemul@...il.com>
CC: <davem@...emloft.net>, <yoshfuji@...ux-ipv6.org>,
<dsahern@...nel.org>, <kuba@...nel.org>, <pabeni@...hat.com>,
Pavel Emelyanov <xemul@...allels.com>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
<den@...tuozzo.com>
Subject: Re: [PATCH net v2] tcp: reset tp->sacked_out when sack is enabled
在 2022/10/27 7:18 PM, Eric Dumazet 写道:
> On Thu, Oct 27, 2022 at 4:14 AM Pavel Emelyanov <ovzxemul@...il.com> wrote:
>>
>>
>> чт, 27 окт. 2022 г., 14:08 Eric Dumazet <edumazet@...gle.com>:
>>> On Thu, Oct 27, 2022 at 3:45 AM Lu Wei <luwei32@...wei.com> wrote:
>>>> If setsockopt with option name of TCP_REPAIR_OPTIONS and opt_code
>>>> of TCPOPT_SACK_PERM is called to enable sack after data is sent
>>>> and before data is acked, it will trigger a warning in function
>>>> tcp_verify_left_out() as follows:
>>>>
>>>> ============================================
>>>> WARNING: CPU: 8 PID: 0 at net/ipv4/tcp_input.c:2132
>>>> tcp_timeout_mark_lost+0x154/0x160
>>>> tcp_enter_loss+0x2b/0x290
>>>> tcp_retransmit_timer+0x50b/0x640
>>>> tcp_write_timer_handler+0x1c8/0x340
>>>> tcp_write_timer+0xe5/0x140
>>>> call_timer_fn+0x3a/0x1b0
>>>> __run_timers.part.0+0x1bf/0x2d0
>>>> run_timer_softirq+0x43/0xb0
>>>> __do_softirq+0xfd/0x373
>>>> __irq_exit_rcu+0xf6/0x140
>>>>
>>>> This warning occurs in several steps:
>>>> Step1. If sack is not enabled, when server receives dup-ack,
>>>> it calls tcp_add_reno_sack() to increase tp->sacked_out.
>>>>
>>>> Step2. Setsockopt() is called to enable sack
>>>>
>>>> Step3. The retransmit timer expires, it calls tcp_timeout_mark_lost()
>>>> to increase tp->lost_out but not clear tp->sacked_out because
>>>> sack is enabled and tcp_is_reno() is false.
>>>>
>>>> So tp->left_out is increased repeatly in Step1 and Step3 and it is
>>>> greater than tp->packets_out and trigger the warning. In function
>>>> tcp_timeout_mark_lost(), tp->sacked_out will be cleared if Step2 not
>>>> happen and the warning will not be triggered. As suggested by Denis
>>>> and Eric, TCP_REPAIR_OPTIONS should be prohibited if data was already
>>>> sent.
>>>>
>>>> socket-tcp tests in CRIU has been tested as follows:
>>>> $ sudo ./test/zdtm.py run -t zdtm/static/socket-tcp* --keep-going \
>>>> --ignore-taint
>>>>
>>>> socket-tcp* represent all socket-tcp tests in test/zdtm/static/.
>>>>
>>>> Fixes: b139ba4e90dc ("tcp: Repair connection-time negotiated parameters")
>>>> Signed-off-by: Lu Wei <luwei32@...wei.com>
>>>> ---
>>>> net/ipv4/tcp.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>>>> index ef14efa1fb70..ef876e70f7fe 100644
>>>> --- a/net/ipv4/tcp.c
>>>> +++ b/net/ipv4/tcp.c
>>>> @@ -3647,7 +3647,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
>>>> case TCP_REPAIR_OPTIONS:
>>>> if (!tp->repair)
>>>> err = -EINVAL;
>>>> - else if (sk->sk_state == TCP_ESTABLISHED)
>>>> + else if (sk->sk_state == TCP_ESTABLISHED && !tp->packets_out)
>>> You keep focusing on packets_out :/
>>>
>>> What I said was : TCP_REPAIR_OPTIONS must be denied if any packets
>>> have been sent (and possibly already ACK)
>>
>>
>> If repair mode is ON, why does socket send any packets? This shouldn't happen from my perspective.
repair mode is entered after packets are sent. the process is:
1. client sends packets
2. client reveives dup-ack and increace sacked_out
3. client enter repair mode and set TCP_SACK_SEEN
4. retransmit_timer expires and enter_loss and increace lost_out
5. tcp_left_out(tp->sacked_out + tp->lost_out) is greater than packets_out and triggers the warning
> Exactly.
>
> TCP_REPAIR is easily abused by fuzzers.
>
> We need to enforce sanity rules in the kernel, not assuming user space
> is following the CRIU way of doing things.
yes, the warning is caught by a fuzzers, please refer to the attachment.
>
>> -- Pavel
>>
>>> Looking at tp->packets_out alone is not sufficient.
>>>
>>>> err = tcp_repair_options_est(sk, optval, optlen);
>>>> else
>>>> err = -EPERM;
>>>> --
>>>> 2.31.1
>>>>
> .
--
Best Regards,
Lu Wei
View attachment "a652ddf168ffc4273a60f6321077e72a156966e8_151a247b080000.c" of type "text/plain" (45422 bytes)
Powered by blists - more mailing lists