Message-ID: <CANn89iKKS5j8tttrnky4Pmqk31dRMQ4t-jCP=GeU3JYQoRdxtg@mail.gmail.com>
Date: Thu, 26 Oct 2023 12:12:11 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Coco Li <lixiaoyan@...gle.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Neal Cardwell <ncardwell@...gle.com>,
Mubashir Adnan Qureshi <mubashirq@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew@...n.ch>,
Jonathan Corbet <corbet@....net>, David Ahern <dsahern@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org, Chao Wu <wwchao@...gle.com>,
Wei Wang <weiwan@...gle.com>, Pradeep Nemavat <pnemavat@...gle.com>
Subject: Re: [PATCH v4 net-next 6/6] tcp: reorganize tcp_sock fast path variables
On Thu, Oct 26, 2023 at 10:20 AM Coco Li <lixiaoyan@...gle.com> wrote:
>
> The variables are organized in the following way:
>
> - TX read-mostly hotpath cache lines
> - TXRX read-mostly hotpath cache lines
> - RX read-mostly hotpath cache lines
> - TX read-write hotpath cache line
> - TXRX read-write hotpath cache line
> - RX read-write hotpath cache line
>
> Fastpath cachelines end after rcvq_space.
>
> Cache line boundaries are enforced only between read-mostly and
> read-write. That is, if read-mostly tx cachelines bleed into
> read-mostly txrx cachelines, we do not care. We care about the
> boundaries between read and write cachelines because we want
> to prevent false sharing.
>
> Fast path variables span cache lines before change: 12
> Fast path variables span cache lines after change: 8
>
> Signed-off-by: Coco Li <lixiaoyan@...gle.com>
> Suggested-by: Eric Dumazet <edumazet@...gle.com>
> Reviewed-by: Wei Wang <weiwan@...gle.com>
> Reviewed-by: David Ahern <dsahern@...nel.org>
> ---
> include/linux/tcp.h | 240 +++++++++++++++++++++++---------------------
> net/ipv4/tcp.c | 85 ++++++++++++++++
> 2 files changed, 211 insertions(+), 114 deletions(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 6df715b6e51d4..67b00ee0248f8 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -176,23 +176,113 @@ static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)
> #define TCP_RMEM_TO_WIN_SCALE 8
>
> struct tcp_sock {
> + /* Cacheline organization can be found documented in
> + * Documentation/networking/net_cachelines/tcp_sock.rst.
> + * Please update the document when adding new fields.
> + */
> +
> /* inet_connection_sock has to be the first member of tcp_sock */
> struct inet_connection_sock inet_conn;
> - u16 tcp_header_len; /* Bytes of tcp header to send */
> +
> + __cacheline_group_begin(tcp_sock_read);
Same remarks here.
In __cacheline_group_begin(NAME), NAME should reflect the intent,
which you documented as "TX read-mostly hotpath cache lines".
NAME should therefore be tx_hotpath or something similar.
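Purely as a sketch (the exact name is up to you, as long as it reflects
that intent), something like:

	__cacheline_group_begin(tcp_sock_read_tx);
	/* TX read-mostly hotpath cache lines */
	u32	max_window;	/* Maximal window ever seen from peer */
	u32	rcv_ssthresh;	/* Current window clamp */
	u32	reordering;	/* Packet reordering metric. */
	u32	notsent_lowat;	/* TCP_NOTSENT_LOWAT */
	u16	gso_segs;	/* Max number of segs per GSO packet */
	/* from STCP, retrans queue hinting */
	struct sk_buff *lost_skb_hint;
	struct sk_buff *retransmit_skb_hint;
	__cacheline_group_end(tcp_sock_read_tx);

Here "tcp_sock_read_tx" is only an illustration of a name encoding both
the TX and the read-mostly parts; tx_hotpath or similar is fine too.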
> + /* TX read-mostly hotpath cache lines */
> + /* timestamp of last sent data packet (for restart window) */
> + u32 max_window; /* Maximal window ever seen from peer */
> + u32 rcv_ssthresh; /* Current window clamp */
> + u32 reordering; /* Packet reordering metric. */
> + u32 notsent_lowat; /* TCP_NOTSENT_LOWAT */
> u16 gso_segs; /* Max number of segs per GSO packet */
> + /* from STCP, retrans queue hinting */
> + struct sk_buff *lost_skb_hint;
> + struct sk_buff *retransmit_skb_hint;
> +
> + /* TXRX read-mostly hotpath cache lines */
> + u32 tsoffset; /* timestamp offset */
> + u32 snd_wnd; /* The window we expect to receive */
> + u32 mss_cache; /* Cached effective mss, not including SACKS */
> + u32 snd_cwnd; /* Sending congestion window */
> + u32 prr_out; /* Total number of pkts sent during Recovery. */
> + u32 lost_out; /* Lost packets */
> + u32 sacked_out; /* SACK'd packets */
> + u16 tcp_header_len; /* Bytes of tcp header to send */
> + u8 chrono_type : 2, /* current chronograph type */
> + repair : 1,
> + is_sack_reneg:1, /* in recovery from loss with SACK reneg? */
> + is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
> +
And of course, the prior group should end here, and a new group should begin.
We identified 6 groups, so please use 6 groups :)
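Sketching it out (group names are placeholders only), the struct would
then carry six begin/end marker pairs, mirroring the changelog:

	__cacheline_group_begin(tcp_sock_read_tx);
	/* TX read-mostly hotpath cache lines */
	__cacheline_group_end(tcp_sock_read_tx);

	__cacheline_group_begin(tcp_sock_read_txrx);
	/* TXRX read-mostly hotpath cache lines */
	__cacheline_group_end(tcp_sock_read_txrx);

	__cacheline_group_begin(tcp_sock_read_rx);
	/* RX read-mostly hotpath cache lines */
	__cacheline_group_end(tcp_sock_read_rx);

	__cacheline_group_begin(tcp_sock_write_tx);
	/* TX read-write hotpath cache line */
	__cacheline_group_end(tcp_sock_write_tx);

	__cacheline_group_begin(tcp_sock_write_txrx);
	/* TXRX read-write hotpath cache line */
	__cacheline_group_end(tcp_sock_write_txrx);

	__cacheline_group_begin(tcp_sock_write_rx);
	/* RX read-write hotpath cache line */
	__cacheline_group_end(tcp_sock_write_rx);

And, assuming the net/ipv4/tcp.c part of this patch keeps a build-time
check based on the CACHELINE_ASSERT_GROUP_MEMBER() helper added earlier
in this series, each group can then be verified field by field:

	/* Hypothetical excerpt, one assert per field and per group. */
	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_read_tx, max_window);
	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_read_tx, notsent_lowat);
	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_read_txrx, snd_cwnd);

That keeps the read/write boundaries the changelog cares about honest as
fields get added later.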