lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iKKS5j8tttrnky4Pmqk31dRMQ4t-jCP=GeU3JYQoRdxtg@mail.gmail.com>
Date: Thu, 26 Oct 2023 12:12:11 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Coco Li <lixiaoyan@...gle.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Neal Cardwell <ncardwell@...gle.com>, 
	Mubashir Adnan Qureshi <mubashirq@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew@...n.ch>, 
	Jonathan Corbet <corbet@....net>, David Ahern <dsahern@...nel.org>, 
	Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org, Chao Wu <wwchao@...gle.com>, 
	Wei Wang <weiwan@...gle.com>, Pradeep Nemavat <pnemavat@...gle.com>
Subject: Re: [PATCH v4 net-next 6/6] tcp: reorganize tcp_sock fast path variables

On Thu, Oct 26, 2023 at 10:20 AM Coco Li <lixiaoyan@...gle.com> wrote:
>
> The variables are organized according in the following way:
>
> - TX read-mostly hotpath cache lines
> - TXRX read-mostly hotpath cache lines
> - RX read-mostly hotpath cache lines
> - TX read-write hotpath cache line
> - TXRX read-write hotpath cache line
> - RX read-write hotpath cache line
>
> Fastpath cachelines end after rcvq_space.
>
> Cache line boundaries are enfored only between read-mostly and
> read-write. That is, if read-mostly tx cachelines bleed into
> read-mostly txrx cachelines, we do not care. We care about the
> boundaries between read and write cachelines because we want
> to prevent false sharing.
>
> Fast path variables span cache lines before change: 12
> Fast path variables span cache lines after change: 8
>
> Signed-off-by: Coco Li <lixiaoyan@...gle.com>
> Suggested-by: Eric Dumazet <edumazet@...gle.com>
> Reviewed-by: Wei Wang <weiwan@...gle.com>
> Reviewed-by: David Ahern <dsahern@...nel.org>
> ---
>  include/linux/tcp.h | 240 +++++++++++++++++++++++---------------------
>  net/ipv4/tcp.c      |  85 ++++++++++++++++
>  2 files changed, 211 insertions(+), 114 deletions(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 6df715b6e51d4..67b00ee0248f8 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -176,23 +176,113 @@ static inline struct tcp_request_sock *tcp_rsk(const struct request_sock *req)
>  #define TCP_RMEM_TO_WIN_SCALE 8
>
>  struct tcp_sock {
> +       /* Cacheline organization can be found documented in
> +        * Documentation/networking/net_cachelines/tcp_sock.rst.
> +        * Please update the document when adding new fields.
> +        */
> +
>         /* inet_connection_sock has to be the first member of tcp_sock */
>         struct inet_connection_sock     inet_conn;
> -       u16     tcp_header_len; /* Bytes of tcp header to send          */
> +
> +       __cacheline_group_begin(tcp_sock_read);

Same remarks here.
In __cacheline_group_begin(NAME), NAME should reflect the intent,
which you documented as "TX read-mostly hotpath cache lines *

NAME should therefore be tx_hotpath or something similar.

> +       /* TX read-mostly hotpath cache lines */
> +       /* timestamp of last sent data packet (for restart window) */
> +       u32     max_window;     /* Maximal window ever seen from peer   */
> +       u32     rcv_ssthresh;   /* Current window clamp                 */
> +       u32     reordering;     /* Packet reordering metric.            */
> +       u32     notsent_lowat;  /* TCP_NOTSENT_LOWAT */
>         u16     gso_segs;       /* Max number of segs per GSO packet    */
> +       /* from STCP, retrans queue hinting */
> +       struct sk_buff *lost_skb_hint;
> +       struct sk_buff *retransmit_skb_hint;
> +
> +       /* TXRX read-mostly hotpath cache lines */
> +       u32     tsoffset;       /* timestamp offset */
> +       u32     snd_wnd;        /* The window we expect to receive      */
> +       u32     mss_cache;      /* Cached effective mss, not including SACKS */
> +       u32     snd_cwnd;       /* Sending congestion window            */
> +       u32     prr_out;        /* Total number of pkts sent during Recovery. */
> +       u32     lost_out;       /* Lost packets                 */
> +       u32     sacked_out;     /* SACK'd packets                       */
> +       u16     tcp_header_len; /* Bytes of tcp header to send          */
> +       u8      chrono_type : 2,        /* current chronograph type */
> +               repair      : 1,
> +               is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
> +               is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
> +

And of course, prior group should end here, and a new group should begin.


We identified 6 groups, so please use 6 groups :)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ