[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<PAXPR07MB798470401703ACBD26148633A3BB2@PAXPR07MB7984.eurprd07.prod.outlook.com>
Date: Tue, 22 Apr 2025 15:48:56 +0000
From: "Chia-Yu Chang (Nokia)" <chia-yu.chang@...ia-bell-labs.com>
To: Simon Horman <horms@...nel.org>
CC: "dsahern@...nel.org" <dsahern@...nel.org>, "kuniyu@...zon.com"
<kuniyu@...zon.com>, "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "dave.taht@...il.com"
<dave.taht@...il.com>, "pabeni@...hat.com" <pabeni@...hat.com>,
"jhs@...atatu.com" <jhs@...atatu.com>, "kuba@...nel.org" <kuba@...nel.org>,
"stephen@...workplumber.org" <stephen@...workplumber.org>,
"xiyou.wangcong@...il.com" <xiyou.wangcong@...il.com>, "jiri@...nulli.us"
<jiri@...nulli.us>, "davem@...emloft.net" <davem@...emloft.net>,
"edumazet@...gle.com" <edumazet@...gle.com>, "andrew+netdev@...n.ch"
<andrew+netdev@...n.ch>, "donald.hunter@...il.com" <donald.hunter@...il.com>,
"ast@...erby.net" <ast@...erby.net>, "liuhangbin@...il.com"
<liuhangbin@...il.com>, "shuah@...nel.org" <shuah@...nel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
"ij@...nel.org" <ij@...nel.org>, "ncardwell@...gle.com"
<ncardwell@...gle.com>, "Koen De Schepper (Nokia)"
<koen.de_schepper@...ia-bell-labs.com>, g.white <g.white@...lelabs.com>,
"ingemar.s.johansson@...csson.com" <ingemar.s.johansson@...csson.com>,
"mirja.kuehlewind@...csson.com" <mirja.kuehlewind@...csson.com>,
"cheshire@...le.com" <cheshire@...le.com>, "rs.ietf@....at" <rs.ietf@....at>,
"Jason_Livingood@...cast.com" <Jason_Livingood@...cast.com>, vidhi_goel
<vidhi_goel@...le.com>
Subject: RE: [PATCH v4 net-next 10/15] tcp: accecn: AccECN option send control
> -----Original Message-----
> From: Simon Horman <horms@...nel.org>
> Sent: Friday, April 18, 2025 8:25 PM
> To: Chia-Yu Chang (Nokia) <chia-yu.chang@...ia-bell-labs.com>
> Cc: dsahern@...nel.org; kuniyu@...zon.com; bpf@...r.kernel.org; netdev@...r.kernel.org; dave.taht@...il.com; pabeni@...hat.com; jhs@...atatu.com; kuba@...nel.org; stephen@...workplumber.org; xiyou.wangcong@...il.com; jiri@...nulli.us; davem@...emloft.net; edumazet@...gle.com; andrew+netdev@...n.ch; donald.hunter@...il.com; ast@...erby.net; liuhangbin@...il.com; shuah@...nel.org; linux-kselftest@...r.kernel.org; ij@...nel.org; ncardwell@...gle.com; Koen De Schepper (Nokia) <koen.de_schepper@...ia-bell-labs.com>; g.white <g.white@...lelabs.com>; ingemar.s.johansson@...csson.com; mirja.kuehlewind@...csson.com; cheshire@...le.com; rs.ietf@....at; Jason_Livingood@...cast.com; vidhi_goel <vidhi_goel@...le.com>
> Subject: Re: [PATCH v4 net-next 10/15] tcp: accecn: AccECN option send control
>
>
> CAUTION: This is an external email. Please be very careful when clicking links or opening attachments. See the URL nok.it/ext for additional information.
>
>
>
> On Fri, Apr 18, 2025 at 06:34:07PM +0100, Simon Horman wrote:
> > On Fri, Apr 18, 2025 at 01:00:24AM +0200, chia-yu.chang@...ia-bell-labs.com wrote:
> > > From: Ilpo Järvinen <ij@...nel.org>
> > >
> > > Instead of sending the option in every ACK, limit sending to those
> > > ACKs where the option is necessary:
> > > - Handshake
> > > - "Change-triggered ACK" + the ACK following it. The
> > > 2nd ACK is necessary to unambiguously indicate which
> > > of the ECN byte counters in increasing. The first
> > > ACK has two counters increasing due to the ecnfield
> > > edge.
> > > - ACKs with CE to allow CEP delta validations to take
> > > advantage of the option.
> > > - Force option to be sent every at least once per 2^22
> > > bytes. The check is done using the bit edges of the
> > > byte counters (avoids need for extra variables).
> > > - AccECN option beacon to send a few times per RTT even if
> > > nothing in the ECN state requires that. The default is 3
> > > times per RTT, and its period can be set via
> > > sysctl_tcp_ecn_option_beacon.
> > >
> > > Signed-off-by: Ilpo Järvinen <ij@...nel.org>
> > > Co-developed-by: Chia-Yu Chang <chia-yu.chang@...ia-bell-labs.com>
> > > Signed-off-by: Chia-Yu Chang <chia-yu.chang@...ia-bell-labs.com>
> > > ---
> > > include/linux/tcp.h | 3 +++
> > > include/net/netns/ipv4.h | 1 +
> > > include/net/tcp.h | 1 +
> > > net/ipv4/sysctl_net_ipv4.c | 9 ++++++++
> > > net/ipv4/tcp.c | 5 ++++-
> > > net/ipv4/tcp_input.c | 36 +++++++++++++++++++++++++++++++-
> > > net/ipv4/tcp_ipv4.c | 1 +
> > > net/ipv4/tcp_minisocks.c | 2 ++
> > > net/ipv4/tcp_output.c | 42 ++++++++++++++++++++++++++++++--------
> > > 9 files changed, 90 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/include/linux/tcp.h b/include/linux/tcp.h index
> > > 0e032d9631ac..9619524d8901 100644
> > > --- a/include/linux/tcp.h
> > > +++ b/include/linux/tcp.h
> > > @@ -309,7 +309,10 @@ struct tcp_sock {
> > > u8 received_ce_pending:4, /* Not yet transmit cnt of received_ce */
> > > unused2:4;
> > > u8 accecn_minlen:2,/* Minimum length of AccECN option sent */
> > > + prev_ecnfield:2,/* ECN bits from the previous segment */
> > > + accecn_opt_demand:2,/* Demand AccECN option for n next
> > > + ACKs */
> > > est_ecnfield:2;/* ECN field for AccECN delivered
> > > estimates */
> > > + u64 accecn_opt_tstamp; /* Last AccECN option sent timestamp */
> > > u32 app_limited; /* limited until "delivered" reaches this val */
> > > u32 rcv_wnd; /* Current receiver window */
> > > /*
> >
> > ...
> >
> > > @@ -5113,7 +5116,7 @@ static void __init tcp_struct_check(void)
> > > /* 32bit arches with 8byte alignment on u64 fields might need padding
> > > * before tcp_clock_cache.
> > > */
> > > - CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 122 + 6);
> > > + CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock,
> > > + tcp_sock_write_txrx, 130 + 6);
> >
> > Hi,
> >
> > While this seems find on x86_64, x86_32 and arm64, it does not seem
> > correct on ARM (32-bit).
> >
> > This is because the existing two byte hole after est_ecnfield grows to
> > 6 bytes. I assume this is because of alignment requirements.
> > But in any case, the result is that the overall group size increases
> > by 12 bytes rather than 8.
> >
> > I believe that you can avoid the hole growing, and thus limit the
> > overall increase in size of the group to 12 bytes, by placing
> > accecn_opt_tstamp elsewhere, e.g. after app_limited rather than before it.
> >
> > You can exercise this by cross compiling for ARM and examining the
> > structure layout using pahole.
> >
> > Cross compilers available from [1] should be able to do that something
> > like this:
> >
> > PATH=".../gcc-14.2.0-nolibc/arm-linux-gnueabi/bin:$PATH"
> > export ARCH=arm
> > export CROSS_COMPILE=arm-linux-gnueabi-
> >
> > make allmodconfig
> > echo "CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y" >> .config yes "" |
> > make oldconfig
> >
> > make net/ipv4/tcp.o
>
> Sorry, I omitted the invocation of pahole.
>
> pahole net/ipv4/tcp.o
>
> > [1]
> > https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmirr
> > ors.edge.kernel.org%2Fpub%2Ftools%2Fcrosstool%2Ffiles%2Fbin%2Fx86_64%2
> > F14.2.0%2F&data=05%7C02%7Cchia-yu.chang%40nokia-bell-labs.com%7Cd00e29
> > 455c5a42475ae408dd7ea658e2%7C5d4717519675428d917b70f44f9630b0%7C0%7C0%
> > 7C638805975110989177%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsI
> > lYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C
> > 0%7C%7C%7C&sdata=jYRGDvf%2Bx7lGuSxtgGfredGRNZYDdmlxpiUnwQr4ICQ%3D&rese
> > rved=0
>
> --
> pw-bot: changes-requested
Hi Simon,
Thanks for the tip, I just fixed it based on pahole results and resubmit v5.
BRs,
Chia-Yu
Powered by blists - more mailing lists