Message-ID: <CANn89i+qYjt45-qO11vu_v=TrK7tn-C=iA5q7bw9YbK-qe5KZA@mail.gmail.com>
Date: Thu, 21 Aug 2025 05:46:49 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: chia-yu.chang@...ia-bell-labs.com
Cc: pabeni@...hat.com, linux-doc@...r.kernel.org, corbet@....net,
horms@...nel.org, dsahern@...nel.org, kuniyu@...zon.com, bpf@...r.kernel.org,
netdev@...r.kernel.org, dave.taht@...il.com, jhs@...atatu.com,
kuba@...nel.org, stephen@...workplumber.org, xiyou.wangcong@...il.com,
jiri@...nulli.us, davem@...emloft.net, andrew+netdev@...n.ch,
donald.hunter@...il.com, ast@...erby.net, liuhangbin@...il.com,
shuah@...nel.org, linux-kselftest@...r.kernel.org, ij@...nel.org,
ncardwell@...gle.com, koen.de_schepper@...ia-bell-labs.com,
g.white@...lelabs.com, ingemar.s.johansson@...csson.com,
mirja.kuehlewind@...csson.com, cheshire@...le.com, rs.ietf@....at,
Jason_Livingood@...cast.com, vidhi_goel@...le.com
Subject: Re: [PATCH v15 net-next 11/14] tcp: accecn: AccECN option send control
On Fri, Aug 15, 2025 at 1:40 AM <chia-yu.chang@...ia-bell-labs.com> wrote:
>
> From: Chia-Yu Chang <chia-yu.chang@...ia-bell-labs.com>
>
> Instead of sending the option in every ACK, limit sending to
> those ACKs where the option is necessary:
> - Handshake
> - "Change-triggered ACK" + the ACK following it. The
> 2nd ACK is necessary to unambiguously indicate which
> of the ECN byte counters is increasing. The first
> ACK has two counters increasing due to the ecnfield
> edge.
> - ACKs with CE to allow CEP delta validations to take
> advantage of the option.
> - Force the option to be sent at least once per 2^22
> bytes. The check is done using the bit edges of the
> byte counters (avoids need for extra variables).
> - AccECN option beacon to send a few times per RTT even if
> nothing in the ECN state requires that. The default is 3
> times per RTT, and its period can be set via
> sysctl_tcp_ecn_option_beacon.
>
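
The last two rules in the list above can be illustrated with a small user-space sketch. It is not the patch's code: the helper names, the microsecond units, and the treatment of a beacon value of 0 as "disabled" are assumptions made for illustration. The first helper shows why a bit-edge test needs no extra state: comparing the old and new counter values above bit 21 is enough to detect a crossed 2^22-byte boundary. The second shows one way a "few times per RTT" beacon check could be written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when adding len bytes to a 32-bit byte counter
 * changes any bit at position 22 or above, i.e. the counter crossed a
 * 2^22-byte boundary (or wrapped). Only the old and new values are needed. */
static bool accecn_bytes_edge(uint32_t old_bytes, uint32_t len)
{
    uint32_t new_bytes = old_bytes + len;   /* unsigned add wraps mod 2^32 */

    return ((old_bytes ^ new_bytes) >> 22) != 0;
}

/* Hypothetical helper: true when at least rtt_us / beacon microseconds have
 * passed since the option was last sent. Treating beacon == 0 as "disabled"
 * is an assumption of this sketch. */
static bool accecn_beacon_due(uint64_t now_us, uint64_t last_opt_us,
                              uint32_t rtt_us, uint32_t beacon)
{
    if (beacon == 0)
        return false;
    /* Multiply instead of divide to avoid rounding the interval down. */
    return (now_us - last_opt_us) * beacon >= rtt_us;
}
```

With a beacon value of 3 and a 300 us RTT, the second helper reports the option as due once 100 us have elapsed since the last one was sent.
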
> Below are the pahole outcomes before and after this patch,
> in which the group size of tcp_sock_write_tx is increased
> from 89 to 97 due to the new u64 accecn_opt_tstamp member:
>
> [BEFORE THIS PATCH]
> struct tcp_sock {
> [...]
> u64 tcp_wstamp_ns; /* 2488 8 */
> struct list_head tsorted_sent_queue; /* 2496 16 */
>
> [...]
> __cacheline_group_end__tcp_sock_write_tx[0]; /* 2521 0 */
> __cacheline_group_begin__tcp_sock_write_txrx[0]; /* 2521 0 */
> u8 nonagle:4; /* 2521: 0 1 */
> u8 rate_app_limited:1; /* 2521: 4 1 */
> /* XXX 3 bits hole, try to pack */
>
> /* Force alignment to the next boundary: */
> u8 :0;
> u8 received_ce_pending:4;/* 2522: 0 1 */
> u8 unused2:4; /* 2522: 4 1 */
> u8 accecn_minlen:2; /* 2523: 0 1 */
> u8 est_ecnfield:2; /* 2523: 2 1 */
> u8 unused3:4; /* 2523: 4 1 */
>
> [...]
> __cacheline_group_end__tcp_sock_write_txrx[0]; /* 2628 0 */
>
> [...]
> /* size: 3200, cachelines: 50, members: 171 */
> }
>
> [AFTER THIS PATCH]
> struct tcp_sock {
> [...]
> u64 tcp_wstamp_ns; /* 2488 8 */
> u64 accecn_opt_tstamp; /* 2496 8 */
> struct list_head tsorted_sent_queue; /* 2504 16 */
>
> [...]
> __cacheline_group_end__tcp_sock_write_tx[0]; /* 2529 0 */
> __cacheline_group_begin__tcp_sock_write_txrx[0]; /* 2529 0 */
> u8 nonagle:4; /* 2529: 0 1 */
> u8 rate_app_limited:1; /* 2529: 4 1 */
> /* XXX 3 bits hole, try to pack */
>
> /* Force alignment to the next boundary: */
> u8 :0;
> u8 received_ce_pending:4;/* 2530: 0 1 */
> u8 unused2:4; /* 2530: 4 1 */
> u8 accecn_minlen:2; /* 2531: 0 1 */
> u8 est_ecnfield:2; /* 2531: 2 1 */
> u8 accecn_opt_demand:2; /* 2531: 4 1 */
> u8 prev_ecnfield:2; /* 2531: 6 1 */
>
> [...]
> __cacheline_group_end__tcp_sock_write_txrx[0]; /* 2636 0 */
>
> [...]
> /* size: 3200, cachelines: 50, members: 173 */
> }
>
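
The offset changes in the two pahole dumps follow from simple layout arithmetic, which the miniature below tries to reproduce. The struct names and the two-pointer stand-in for struct list_head are hypothetical, and uint8_t bit-fields rely on compiler support (GCC and Clang accept them). It shows that inserting one 8-byte member moves everything after it by 8 bytes, matching 2496 -> 2504 and 2521 -> 2529, while the two new 2-bit fields fit into the 4 bits previously named unused3.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's struct list_head (two pointers). */
struct list_head_sketch { void *next, *prev; };

/* Hypothetical miniature of the layout before the patch. */
struct tx_before {
    uint64_t tcp_wstamp_ns;
    struct list_head_sketch tsorted_sent_queue;
};

/* Same layout with the new 8-byte timestamp inserted. */
struct tx_after {
    uint64_t tcp_wstamp_ns;
    uint64_t accecn_opt_tstamp;
    struct list_head_sketch tsorted_sent_queue;
};

/* Number of bytes by which tsorted_sent_queue moves. */
static size_t tsorted_shift(void)
{
    return offsetof(struct tx_after, tsorted_sent_queue) -
           offsetof(struct tx_before, tsorted_sent_queue);
}

/* The two new 2-bit fields take over the 4 spare bits, so the byte that
 * holds them is not expected to grow. */
struct bits_before { uint8_t accecn_minlen:2, est_ecnfield:2, unused3:4; };
struct bits_after  { uint8_t accecn_minlen:2, est_ecnfield:2,
                     accecn_opt_demand:2, prev_ecnfield:2; };
```
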
> Signed-off-by: Chia-Yu Chang <chia-yu.chang@...ia-bell-labs.com>
> Co-developed-by: Ilpo Järvinen <ij@...nel.org>
> Signed-off-by: Ilpo Järvinen <ij@...nel.org>
>
Reviewed-by: Eric Dumazet <edumazet@...gle.com>