linux-kernel - Re: [PATCH net] tcp: reset tp->sacked

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <fd9abfe4-962e-ceef-5ab8-29e654303343@virtuozzo.com>
Date:   Wed, 26 Oct 2022 20:59:18 +0200
From:   "Denis V. Lunev" <den@...tuozzo.com>
To:     Eric Dumazet <edumazet@...gle.com>, Lu Wei <luwei32@...wei.com>
Cc:     davem@...emloft.net, yoshfuji@...ux-ipv6.org, dsahern@...nel.org,
        kuba@...nel.org, pabeni@...hat.com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        "Pavel Emelianov (Gmail)" <ovzxemul@...il.com>, avagin@...il.com,
        Pavel Tikhomirov <ptikhomirov@...tuozzo.com>
Subject: Re: [PATCH net] tcp: reset tp->sacked_out when sack is enabled

On 10/26/22 16:33, Eric Dumazet wrote:
> On Wed, Oct 26, 2022 at 7:30 AM Eric Dumazet <edumazet@...gle.com> wrote:
>> On Wed, Oct 26, 2022 at 7:12 AM Lu Wei <luwei32@...wei.com> wrote:
>>> The meaning of tp->sacked_out depends on whether sack is enabled
>>> or not. If setsockopt is called to enable sack_ok via
>>> tcp_repair_options_est(), tp->sacked_out should be cleared, or it
>>> will trigger warning in tcp_verify_left_out as follows:
>>>
>>> ============================================
>>> WARNING: CPU: 8 PID: 0 at net/ipv4/tcp_input.c:2132
>>> tcp_timeout_mark_lost+0x154/0x160
>>> tcp_enter_loss+0x2b/0x290
>>> tcp_retransmit_timer+0x50b/0x640
>>> tcp_write_timer_handler+0x1c8/0x340
>>> tcp_write_timer+0xe5/0x140
>>> call_timer_fn+0x3a/0x1b0
>>> __run_timers.part.0+0x1bf/0x2d0
>>> run_timer_softirq+0x43/0xb0
>>> __do_softirq+0xfd/0x373
>>> __irq_exit_rcu+0xf6/0x140
>>>
>>> This warning occurs in several steps:
>>> Step1. If sack is not enabled, when server receives dup-ack,
>>>         it calls tcp_add_reno_sack() to increase tp->sacked_out.
>>>
>>> Step2. Setsockopt() is called to enable sack
>>>
>>> Step3. The retransmit timer expires, it calls tcp_timeout_mark_lost()
>>>         to increase tp->lost_out but not clear tp->sacked_out because
>>>         sack is enabled and tcp_is_reno() is false.
>>>
>>> So tp->left_out is increased repeatly in Step1 and Step3 and it is
>>> greater than tp->packets_out and trigger the warning. In function
>>> tcp_timeout_mark_lost(), tp->sacked_out will be cleared if Step2 not
>>> happen and the warning will not be triggered. So this patch clears
>>> tp->sacked_out in tcp_repair_options_est().
>>>
>>> Fixes: b139ba4e90dc ("tcp: Repair connection-time negotiated parameters")
>>> Signed-off-by: Lu Wei <luwei32@...wei.com>
>>> ---
>>>   net/ipv4/tcp.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>>> index ef14efa1fb70..188d5c0e440f 100644
>>> --- a/net/ipv4/tcp.c
>>> +++ b/net/ipv4/tcp.c
>>> @@ -3282,6 +3282,9 @@ static int tcp_repair_options_est(struct sock *sk, sockptr_t optbuf,
>>>                          if (opt.opt_val != 0)
>>>                                  return -EINVAL;
>>>
>>> +                       if (tcp_is_reno(tp))
>>> +                               tp->sacked_out = 0;
>>> +
>>>                          tp->rx_opt.sack_ok |= TCP_SACK_SEEN;
>>>                          break;
>>>                  case TCPOPT_TIMESTAMP:
>>> --
>>> 2.31.1
>>>
>> Hmm, I am not sure this is the right fix.
>>
>> Probably TCP_REPAIR_OPTIONS should not be allowed if data has already been sent.
>>
>> Pavel, what do you think ?
> Routing to Denis V. Lunev <den@...nvz.org>, because Pavel's address no
> longer works.
>
> Thanks !
Hi, guys!

This code is used in CRIU. I have added CRIU maintainers
Andrey Vagin, Pavel Tikhomirov and new address of Pavel
Emelyanov to CC list.

Here is the quote from Pavel Tikhomirov on the topic.
"We do setsockopt with TCP_REPAIR_OPTIONS in CRIU just
after calling connect to the socket here
https://github.com/checkpoint-restore/criu/blob/18c6426eaeebc5fe7d0f9ca0acb592a3ec828b0c/soccr/soccr.c#L566 

and before libsoccr_restore_queue.

So it seems there should be no data sent in this socket at
the moment, so I believe it is safe to prohibit
TCP_REPAIR_OPTIONS if data was already sent.

Though I'd recomend running some CRIU tests after this
change just to be sure that we don't break it.
E.g.: "zdtm/static/socket-tcp*" or just
"./test/zdtm.py run -a --keep-going --ignore-taint".

Thank you in advance,
     Den