lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 18 Dec 2017 21:16:19 -0800
From:   Steve Ibanez <sibanez@...nford.edu>
To:     Neal Cardwell <ncardwell@...gle.com>
Cc:     Eric Dumazet <edumazet@...gle.com>,
        Yuchung Cheng <ycheng@...gle.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>, Florian Westphal <fw@...len.de>,
        Mohammad Alizadeh <alizadeh@...il.mit.edu>,
        Lawrence Brakmo <brakmo@...com>
Subject: Re: Linux ECN Handling

Hi Neal,

I started looking into this receiver ACKing issue today. Strangely,
when I tried adding printk statements at the top of the
tcp_v4_do_rcv(), tcp_rcv_established(), __tcp_ack_snd_check() and
tcp_send_delayed_ack() functions they were never executed on the
machine running the iperf3 server (i.e. the destination of the flows).
Maybe the iperf3 server is using its own TCP stack?

In any case, the ACKing problem is reproducible using just normal
iperf for which I do see my printk statements being executed. I can
now confirm that when the CWR marked packet (for which no ACK is sent)
arrives at the receiver, the __tcp_ack_snd_check() function is never
invoked; and hence neither is the tcp_send_delayed_ack() function.
Hopefully this helps narrow down where the issue might be? I started
adding some printk statements into the tcp_rcv_established() function,
but I'm not sure where the best places to look would be so I wanted to
ask your advice on this.

In case you're interested, I instrumented the __tcp_ack_snd_check()
function with the following printk statements:

@@ -5057,9 +5117,15 @@ static inline void tcp_data_snd_check(struct sock *sk)
 /*
  * Check if sending an ack is needed.
  */
-static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
+static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible,
const struct tcphdr *th)
 {
        struct tcp_sock *tp = tcp_sk(sk);
+        struct inet_sock *inet = inet_sk(sk);

            /* More than one full frame received... */
        if (((tp->rcv_nxt - tp->rcv_wup) > inet_csk(sk)->icsk_ack.rcv_mss &&
@@ -5071,21 +5137,31 @@ static void __tcp_ack_snd_check(struct sock
*sk, int ofo_possible)
            tcp_in_quickack_mode(sk) ||
            /* We have out of order data. */
            (ofo_possible && !RB_EMPTY_ROOT(&tp->out_of_order_queue))) {
+                // SI: Debugging TCP ECN handeling
+                if (sk->sk_family == AF_INET && th->cwr) {
+                        printk("tcp_debug: __tcp_ack_snd_check:
%pI4/%u CWR set and sending ACK now - rcv_nxt=%u\n",
+                                 &inet->inet_daddr,
ntohs(inet->inet_sport), tp->rcv_nxt);
+                 }
                /* Then ack it now */
                tcp_send_ack(sk);
        } else {
+                // SI: Debugging TCP ECN handeling
+                if (sk->sk_family == AF_INET && th->cwr) {
+                        printk("tcp_debug: __tcp_ack_snd_check:
%pI4/%u CWR set and sending delayed ACK - rcv_nxt=%u\n",
+                                 &inet->inet_daddr,
ntohs(inet->inet_sport), tp->rcv_nxt);
+                 }
                /* Else, send delayed ack. */
-               tcp_send_delayed_ack(sk);
+               tcp_send_delayed_ack(sk, th);
        }
 }

In the kernel log on the receiver, I see the following sequence of
events at a timeout:

[ 2730.145023] tcp_debug: __tcp_ack_snd_check: 10.0.0.5/916 CWR set
and sending ACK now - rcv_nxt=2317949387
[ 2730.145543] tcp_debug: __tcp_ack_snd_check: 10.0.0.5/916 CWR set
and sending ACK now - rcv_nxt=2318243331 <-- last log statement before
timeout
[ 2730.452540] tcp_debug: __tcp_ack_snd_check: 10.0.0.5/916 CWR set
and sending ACK now - rcv_nxt=2318593747
[ 2730.453137] tcp_debug: __tcp_ack_snd_check: 10.0.0.5/916 CWR set
and sending ACK now - rcv_nxt=2318813843

>From the tcpdump trace at the receiver's interface I see that the last
log statement before the timeout corresponds exactly to one CWR packet
before the unACKed CWR packet. For example, in this case, the CWR
packet to which the indicated log statement corresponds has seqNo =
2318196995 and length 46336 ==> 2318196995 + 46336 = 2318243331, which
is exactly the rcv_nxt value of the indicated log statement. And then
unACKed CWR packet arrives and it is completely missing from the log
file, there is no indication of sending a delayed ACK either. Hence my
conclusion that the __tcp_ack_snd_check() function is never invoked by
the receiver upon receiving the unACKed CWR packet.

Sorry if that was long and verbose, I just wanted to be clear on what
I had done. Please do let me know if you have any questions though.

Thanks,
-Steve


On Tue, Dec 5, 2017 at 12:04 PM, Neal Cardwell <ncardwell@...gle.com> wrote:
> On Tue, Dec 5, 2017 at 2:36 PM, Steve Ibanez <sibanez@...nford.edu> wrote:
>> Hi Neal,
>>
>> I've included a link to small trace of 13 packets which is different
>> from the screenshot I attached in my last email, but shows the same
>> sequence of events. It's a bit hard to read the tcptrace due to the
>> 300ms timeout, so I figured this was the best approach.
>>
>> slice.pcap: https://drive.google.com/open?id=1hYXbUClHGbQv1hWG1HZWDO2WYf30N6G8
>
> Thanks for the trace! Attached is a screen shot (first screen shot is
> for the arriving packets with CWR; second is after the RTO). The
> sender behavior looks reasonable. I don't see why the receiver is not
> ACKing. As you say, it does look like a receiver bug. You could try
> adding instrumentation to try to isolate why the receiver is not
> sending an ACK immediately. You might instrument __tcp_ack_snd_check()
> and tcp_send_delayed_ack() so that when the most recent incoming
> packet had cwr set they printk to log what they are deciding in this
> case. Perhaps the tcp_send_delayed_ack()  code is hitting the max_ato
> = HZ / 2 code path?
>
> neal

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ