Date:   Tue, 8 Mar 2022 11:53:38 -0800
From:   Eric Dumazet <edumazet@...gle.com>
To:     David Laight <David.Laight@...lab.com>
Cc:     Jakub Kicinski <kuba@...nel.org>, netdev <netdev@...r.kernel.org>,
        Willem de Bruijn <willemb@...gle.com>,
        Neal Cardwell <ncardwell@...gle.com>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: [RFC net-next] tcp: allow larger TSO to be built under overload

On Tue, Mar 8, 2022 at 1:08 AM David Laight <David.Laight@...lab.com> wrote:
>
> From: Eric Dumazet
> > Sent: 08 March 2022 03:50
> ...
> >         /* Goal is to send at least one packet per ms,
> >          * not one big TSO packet every 100 ms.
> >          * This preserves ACK clocking and is consistent
> >          * with tcp_tso_should_defer() heuristic.
> >          */
> > -       segs = max_t(u32, bytes / mss_now, min_tso_segs);
> > -
> > -       return segs;
> > +       return max_t(u32, bytes / mss_now, min_tso_segs);
> >  }
>
> Which is the common side of that max_t() ?
> If it is mon_tso_segs it might be worth avoiding the
> divide by coding as:
>
>         return bytes > mss_now * min_tso_segs ? bytes / mss_now : min_tso_segs;
>

I think the common case is when the divide must happen.
Not sure if this really matters with current cpus.
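For reference, the two formulations compute the same value; the difference is only whether the divide is executed when the floor wins. This is a hypothetical user-space sketch (the names segs_div/segs_cond and the standalone u32 typedef are made up for illustration, not kernel code):

```c
#include <stdint.h>

typedef uint32_t u32;	/* stand-in for the kernel's u32 */

/* Current form: always divide, then clamp to the floor
 * (the max_t() expression in the patch). */
static u32 segs_div(u32 bytes, u32 mss_now, u32 min_tso_segs)
{
	u32 segs = bytes / mss_now;

	return segs > min_tso_segs ? segs : min_tso_segs;
}

/* David's suggested form: skip the divide when the floor wins.
 * Equivalent as long as mss_now * min_tso_segs does not overflow
 * u32, which holds for sane MSS and segment counts. */
static u32 segs_cond(u32 bytes, u32 mss_now, u32 min_tso_segs)
{
	return bytes > mss_now * min_tso_segs ? bytes / mss_now
					      : min_tso_segs;
}
```

Whether avoiding the divide helps depends on which branch is common, which is Eric's point above.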

Jakub, Neal, I am going to send a patch for net-next.

In conjunction with BIG TCP, this gives a considerable performance boost.


Before:
otrv5:/home/google/edumazet# nstat -n;./super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000;nstat|egrep "TcpInSegs|TcpOutSegs|TcpRetransSegs|Delivered"
  96005
TcpInSegs                       15649381           0.0
TcpOutSegs                      58659574           0.0  # Average of 3.74 4K segments per TSO packet
TcpExtTCPDelivered              58655240           0.0
TcpExtTCPDeliveredCE            21                 0.0

After:
otrv5:/home/google/edumazet# nstat -n;./super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000;nstat|egrep "TcpInSegs|TcpOutSegs|TcpRetransSegs|Delivered"
  96046
TcpInSegs                       1445864            0.0
TcpOutSegs                      58885065           0.0   # Average of 40.72 4K segments per TSO packet
TcpExtTCPDelivered              58880873           0.0
TcpExtTCPDeliveredCE            28                 0.0

-> 1,445,864 ACK packets instead of 15,649,381.
And about 25% of CPU cycles saved, according to perf stat.
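One way to read the per-TSO-packet averages quoted in the nstat output is the ratio of data segments sent to ACKs received (each ACK clocking out roughly one TSO packet). A small sketch reproducing that arithmetic from the counters above (segs_per_ack is a made-up helper name):

```c
#include <stdio.h>

/* Data segments sent per ACK received: TcpOutSegs / TcpInSegs.
 * Approximates segments carried per TSO packet when one ACK
 * triggers roughly one TSO packet. */
static double segs_per_ack(double out_segs, double in_segs)
{
	return out_segs / in_segs;
}
```

With the counters above, segs_per_ack(58659574, 15649381) comes out near 3.75 before the change and segs_per_ack(58885065, 1445864) near 40.7 after, matching the averages noted in the nstat output.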

 Performance counter stats for './super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000':

         66,895.00 msec task-clock                #    2.886 CPUs utilized
         1,312,687      context-switches          # 19623.389 M/sec
             5,645      cpu-migrations            #   84.387 M/sec
           942,412      page-faults               # 14088.139 M/sec
   203,672,224,410      cycles                    # 3044700.936 GHz                  (83.40%)
    18,933,350,691      stalled-cycles-frontend   #    9.30% frontend cycles idle    (83.46%)
   138,500,001,318      stalled-cycles-backend    #   68.00% backend cycles idle     (83.38%)
    53,694,300,814      instructions              #    0.26  insn per cycle
                                                  #    2.58  stalled cycles per insn (83.30%)
     9,100,155,390      branches                  # 136038439.770 M/sec              (83.26%)
       152,331,123      branch-misses             #    1.67% of all branches         (83.47%)

      23.180309488 seconds time elapsed

-->

 Performance counter stats for './super_netperf 600 -H otrv6 -l 20 -- -K dctcp -q 20000000':

         48,964.30 msec task-clock                #    2.103 CPUs utilized
           184,903      context-switches          # 3776.305 M/sec
             3,057      cpu-migrations            #   62.434 M/sec
           940,615      page-faults               # 19210.338 M/sec
   152,390,738,065      cycles                    # 3112301.652 GHz                  (83.61%)
    11,603,675,527      stalled-cycles-frontend   #    7.61% frontend cycles idle    (83.49%)
   120,240,493,440      stalled-cycles-backend    #   78.90% backend cycles idle     (83.30%)
    37,106,498,492      instructions              #    0.24  insn per cycle
                                                  #    3.24  stalled cycles per insn (83.47%)
     5,968,256,846      branches                  # 121890712.483 M/sec              (83.25%)
        88,743,145      branch-misses             #    1.49% of all branches         (83.24%)

      23.284583305 seconds time elapsed
