lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZBN0d8Rn2aib/Akx@localhost.localdomain>
Date:   Thu, 16 Mar 2023 20:56:39 +0100
From:   Michal Kubiak <michal.kubiak@...el.com>
To:     Shay Agroskin <shayagr@...zon.com>
CC:     David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, <netdev@...r.kernel.org>,
        "Woodhouse, David" <dwmw@...zon.com>,
        "Machulsky, Zorik" <zorik@...zon.com>,
        "Matushevsky, Alexander" <matua@...zon.com>,
        "Saeed Bshara" <saeedb@...zon.com>,
        "Wilson, Matt" <msw@...zon.com>,
        "Liguori, Anthony" <aliguori@...zon.com>,
        "Bshara, Nafea" <nafea@...zon.com>,
        "Belgazal, Netanel" <netanel@...zon.com>,
        "Saidi, Ali" <alisaidi@...zon.com>,
        "Herrenschmidt, Benjamin" <benh@...zon.com>,
        "Kiyanovski, Arthur" <akiyano@...zon.com>,
        "Dagan, Noam" <ndagan@...zon.com>,
        "Arinzon, David" <darinzon@...zon.com>,
        "Itzko, Shahar" <itzko@...zon.com>,
        "Abboud, Osama" <osamaabb@...zon.com>
Subject: Re: [PATCH v5 net-next 1/6] ethtool: Add support for configuring
 tx_push_buf_len

On Thu, Mar 16, 2023 at 04:27:01PM +0200, Shay Agroskin wrote:
> This attribute, which is part of ethtool's ring param configuration
> allows the user to specify the maximum number of the packet's payload
> that can be written directly to the device.
> 
> Example usage:
>     # ethtool -G [interface] tx-push-buf-len [number of bytes]
> 
> Co-developed-by: Jakub Kicinski <kuba@...nel.org>
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> Signed-off-by: Shay Agroskin <shayagr@...zon.com>
> ---
>  Documentation/netlink/specs/ethtool.yaml     |  8 ++++
>  Documentation/networking/ethtool-netlink.rst | 47 +++++++++++++-------
>  include/linux/ethtool.h                      | 14 ++++--
>  include/uapi/linux/ethtool_netlink.h         |  2 +
>  net/ethtool/netlink.h                        |  2 +-
>  net/ethtool/rings.c                          | 33 +++++++++++++-
>  6 files changed, 83 insertions(+), 23 deletions(-)
> 
> diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
> index 18ecb7d90cbe..ea9b0cc10fdf 100644
> --- a/Documentation/netlink/specs/ethtool.yaml
> +++ b/Documentation/netlink/specs/ethtool.yaml
> @@ -165,6 +165,12 @@ attribute-sets:
>        -
>          name: rx-push
>          type: u8
> +      -
> +        name: tx-push-buf-len
> +        type: u32
> +      -
> +        name: tx-push-buf-len-max
> +        type: u32
>  
>    -
>      name: mm-stat
> @@ -311,6 +317,8 @@ operations:
>              - cqe-size
>              - tx-push
>              - rx-push
> +            - tx-push-buf-len
> +            - tx-push-buf-len-max
>        dump: *ring-get-op
>      -
>        name: rings-set
> diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
> index e1bc6186d7ea..cd0973d4ba01 100644
> --- a/Documentation/networking/ethtool-netlink.rst
> +++ b/Documentation/networking/ethtool-netlink.rst
> @@ -860,22 +860,24 @@ Request contents:
>  
>  Kernel response contents:
>  
> -  ====================================  ======  ===========================
> -  ``ETHTOOL_A_RINGS_HEADER``            nested  reply header
> -  ``ETHTOOL_A_RINGS_RX_MAX``            u32     max size of RX ring
> -  ``ETHTOOL_A_RINGS_RX_MINI_MAX``       u32     max size of RX mini ring
> -  ``ETHTOOL_A_RINGS_RX_JUMBO_MAX``      u32     max size of RX jumbo ring
> -  ``ETHTOOL_A_RINGS_TX_MAX``            u32     max size of TX ring
> -  ``ETHTOOL_A_RINGS_RX``                u32     size of RX ring
> -  ``ETHTOOL_A_RINGS_RX_MINI``           u32     size of RX mini ring
> -  ``ETHTOOL_A_RINGS_RX_JUMBO``          u32     size of RX jumbo ring
> -  ``ETHTOOL_A_RINGS_TX``                u32     size of TX ring
> -  ``ETHTOOL_A_RINGS_RX_BUF_LEN``        u32     size of buffers on the ring
> -  ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT``    u8      TCP header / data split
> -  ``ETHTOOL_A_RINGS_CQE_SIZE``          u32     Size of TX/RX CQE
> -  ``ETHTOOL_A_RINGS_TX_PUSH``           u8      flag of TX Push mode
> -  ``ETHTOOL_A_RINGS_RX_PUSH``           u8      flag of RX Push mode
> -  ====================================  ======  ===========================
> +  =======================================   ======  ===========================
> +  ``ETHTOOL_A_RINGS_HEADER``                nested  reply header
> +  ``ETHTOOL_A_RINGS_RX_MAX``                u32     max size of RX ring
> +  ``ETHTOOL_A_RINGS_RX_MINI_MAX``           u32     max size of RX mini ring
> +  ``ETHTOOL_A_RINGS_RX_JUMBO_MAX``          u32     max size of RX jumbo ring
> +  ``ETHTOOL_A_RINGS_TX_MAX``                u32     max size of TX ring
> +  ``ETHTOOL_A_RINGS_RX``                    u32     size of RX ring
> +  ``ETHTOOL_A_RINGS_RX_MINI``               u32     size of RX mini ring
> +  ``ETHTOOL_A_RINGS_RX_JUMBO``              u32     size of RX jumbo ring
> +  ``ETHTOOL_A_RINGS_TX``                    u32     size of TX ring
> +  ``ETHTOOL_A_RINGS_RX_BUF_LEN``            u32     size of buffers on the ring
> +  ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT``        u8      TCP header / data split
> +  ``ETHTOOL_A_RINGS_CQE_SIZE``              u32     Size of TX/RX CQE
> +  ``ETHTOOL_A_RINGS_TX_PUSH``               u8      flag of TX Push mode
> +  ``ETHTOOL_A_RINGS_RX_PUSH``               u8      flag of RX Push mode
> +  ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN``       u32     size of TX push buffer
> +  ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX``   u32     max size of TX push buffer
> +  =======================================   ======  ===========================
>  
>  ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
>  page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``).
> @@ -891,6 +893,18 @@ through MMIO writes, thus reducing the latency. However, enabling this feature
>  may increase the CPU cost. Drivers may enforce additional per-packet
>  eligibility checks (e.g. on packet size).
>  
> +``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` specifies the maximum number of bytes of a
> +transmitted packet a driver can push directly to the underlying device
> +('push' mode). Pushing some of the payload bytes to the device has the
> +advantages of reducing latency for small packets by avoiding DMA mapping (same
> +as ``ETHTOOL_A_RINGS_TX_PUSH`` parameter) as well as allowing the underlying
> +device to process packet headers ahead of fetching its payload.
> +This can help the device to make fast actions based on the packet's headers.
> +This is similar to the "tx-copybreak" parameter, which copies the packet to a
> +preallocated DMA memory area instead of mapping new memory. However,
> +tx-push-buff parameter copies the packet directly to the device to allow the
> +device to take faster actions on the packet.
> +
>  RINGS_SET
>  =========
>  
> @@ -908,6 +922,7 @@ Request contents:
>    ``ETHTOOL_A_RINGS_CQE_SIZE``          u32     Size of TX/RX CQE
>    ``ETHTOOL_A_RINGS_TX_PUSH``           u8      flag of TX Push mode
>    ``ETHTOOL_A_RINGS_RX_PUSH``           u8      flag of RX Push mode
> +  ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN``   u32     size of TX push buffer
>    ====================================  ======  ===========================
>  
>  Kernel checks that requested ring sizes do not exceed limits reported by
> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
> index 2792185dda22..798d35890118 100644
> --- a/include/linux/ethtool.h
> +++ b/include/linux/ethtool.h
> @@ -75,6 +75,8 @@ enum {
>   * @tx_push: The flag of tx push mode
>   * @rx_push: The flag of rx push mode
>   * @cqe_size: Size of TX/RX completion queue event
> + * @tx_push_buf_len: Size of TX push buffer
> + * @tx_push_buf_max_len: Maximum allowed size of TX push buffer
>   */
>  struct kernel_ethtool_ringparam {
>  	u32	rx_buf_len;
> @@ -82,6 +84,8 @@ struct kernel_ethtool_ringparam {
>  	u8	tx_push;
>  	u8	rx_push;
>  	u32	cqe_size;
> +	u32	tx_push_buf_len;
> +	u32	tx_push_buf_max_len;
>  };
>  
>  /**
> @@ -90,12 +94,14 @@ struct kernel_ethtool_ringparam {
>   * @ETHTOOL_RING_USE_CQE_SIZE: capture for setting cqe_size
>   * @ETHTOOL_RING_USE_TX_PUSH: capture for setting tx_push
>   * @ETHTOOL_RING_USE_RX_PUSH: capture for setting rx_push
> + * @ETHTOOL_RING_USE_TX_PUSH_BUF_LEN: capture for setting tx_push_buf_len
>   */
>  enum ethtool_supported_ring_param {
> -	ETHTOOL_RING_USE_RX_BUF_LEN = BIT(0),
> -	ETHTOOL_RING_USE_CQE_SIZE   = BIT(1),
> -	ETHTOOL_RING_USE_TX_PUSH    = BIT(2),
> -	ETHTOOL_RING_USE_RX_PUSH    = BIT(3),
> +	ETHTOOL_RING_USE_RX_BUF_LEN		= BIT(0),
> +	ETHTOOL_RING_USE_CQE_SIZE		= BIT(1),
> +	ETHTOOL_RING_USE_TX_PUSH		= BIT(2),
> +	ETHTOOL_RING_USE_RX_PUSH		= BIT(3),
> +	ETHTOOL_RING_USE_TX_PUSH_BUF_LEN	= BIT(4),
>  };
>  
>  #define __ETH_RSS_HASH_BIT(bit)	((u32)1 << (bit))
> diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
> index d39ce21381c5..1ebf8d455f07 100644
> --- a/include/uapi/linux/ethtool_netlink.h
> +++ b/include/uapi/linux/ethtool_netlink.h
> @@ -357,6 +357,8 @@ enum {
>  	ETHTOOL_A_RINGS_CQE_SIZE,			/* u32 */
>  	ETHTOOL_A_RINGS_TX_PUSH,			/* u8 */
>  	ETHTOOL_A_RINGS_RX_PUSH,			/* u8 */
> +	ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN,		/* u32 */
> +	ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX,		/* u32 */
>  
>  	/* add new constants above here */
>  	__ETHTOOL_A_RINGS_CNT,
> diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
> index f7b189ed96b2..79424b34b553 100644
> --- a/net/ethtool/netlink.h
> +++ b/net/ethtool/netlink.h
> @@ -413,7 +413,7 @@ extern const struct nla_policy ethnl_features_set_policy[ETHTOOL_A_FEATURES_WANT
>  extern const struct nla_policy ethnl_privflags_get_policy[ETHTOOL_A_PRIVFLAGS_HEADER + 1];
>  extern const struct nla_policy ethnl_privflags_set_policy[ETHTOOL_A_PRIVFLAGS_FLAGS + 1];
>  extern const struct nla_policy ethnl_rings_get_policy[ETHTOOL_A_RINGS_HEADER + 1];
> -extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_RX_PUSH + 1];
> +extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX + 1];
>  extern const struct nla_policy ethnl_channels_get_policy[ETHTOOL_A_CHANNELS_HEADER + 1];
>  extern const struct nla_policy ethnl_channels_set_policy[ETHTOOL_A_CHANNELS_COMBINED_COUNT + 1];
>  extern const struct nla_policy ethnl_coalesce_get_policy[ETHTOOL_A_COALESCE_HEADER + 1];
> diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c
> index f358cd57d094..574a6f2e57af 100644
> --- a/net/ethtool/rings.c
> +++ b/net/ethtool/rings.c
> @@ -11,6 +11,7 @@ struct rings_reply_data {
>  	struct ethnl_reply_data		base;
>  	struct ethtool_ringparam	ringparam;
>  	struct kernel_ethtool_ringparam	kernel_ringparam;
> +	u32				supported_ring_params;
>  };
>  
>  #define RINGS_REPDATA(__reply_base) \
> @@ -32,6 +33,8 @@ static int rings_prepare_data(const struct ethnl_req_info *req_base,
>  
>  	if (!dev->ethtool_ops->get_ringparam)
>  		return -EOPNOTSUPP;
> +
> +	data->supported_ring_params = dev->ethtool_ops->supported_ring_params;
>  	ret = ethnl_ops_begin(dev);
>  	if (ret < 0)
>  		return ret;
> @@ -57,7 +60,9 @@ static int rings_reply_size(const struct ethnl_req_info *req_base,
>  	       nla_total_size(sizeof(u8))  +	/* _RINGS_TCP_DATA_SPLIT */
>  	       nla_total_size(sizeof(u32)  +	/* _RINGS_CQE_SIZE */
>  	       nla_total_size(sizeof(u8))  +	/* _RINGS_TX_PUSH */
> -	       nla_total_size(sizeof(u8)));	/* _RINGS_RX_PUSH */
> +	       nla_total_size(sizeof(u8))) +	/* _RINGS_RX_PUSH */
> +	       nla_total_size(sizeof(u32)) +	/* _RINGS_TX_PUSH_BUF_LEN */
> +	       nla_total_size(sizeof(u32));	/* _RINGS_TX_PUSH_BUF_LEN_MAX */
>  }
>  
>  static int rings_fill_reply(struct sk_buff *skb,
> @@ -67,6 +72,7 @@ static int rings_fill_reply(struct sk_buff *skb,
>  	const struct rings_reply_data *data = RINGS_REPDATA(reply_base);
>  	const struct kernel_ethtool_ringparam *kr = &data->kernel_ringparam;
>  	const struct ethtool_ringparam *ringparam = &data->ringparam;
> +	u32 supported_ring_params = data->supported_ring_params;
>  
>  	WARN_ON(kr->tcp_data_split > ETHTOOL_TCP_DATA_SPLIT_ENABLED);
>  
> @@ -98,7 +104,12 @@ static int rings_fill_reply(struct sk_buff *skb,
>  	    (kr->cqe_size &&
>  	     (nla_put_u32(skb, ETHTOOL_A_RINGS_CQE_SIZE, kr->cqe_size))) ||
>  	    nla_put_u8(skb, ETHTOOL_A_RINGS_TX_PUSH, !!kr->tx_push) ||
> -	    nla_put_u8(skb, ETHTOOL_A_RINGS_RX_PUSH, !!kr->rx_push))
> +	    nla_put_u8(skb, ETHTOOL_A_RINGS_RX_PUSH, !!kr->rx_push) ||
> +	    ((supported_ring_params & ETHTOOL_RING_USE_TX_PUSH_BUF_LEN) &&
> +	     (nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX,
> +			  kr->tx_push_buf_max_len) ||
> +	      nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN,
> +			  kr->tx_push_buf_len))))
>  		return -EMSGSIZE;
>  
>  	return 0;
> @@ -117,6 +128,7 @@ const struct nla_policy ethnl_rings_set_policy[] = {
>  	[ETHTOOL_A_RINGS_CQE_SIZE]		= NLA_POLICY_MIN(NLA_U32, 1),
>  	[ETHTOOL_A_RINGS_TX_PUSH]		= NLA_POLICY_MAX(NLA_U8, 1),
>  	[ETHTOOL_A_RINGS_RX_PUSH]		= NLA_POLICY_MAX(NLA_U8, 1),
> +	[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN]	= { .type = NLA_U32 },
>  };
>  
>  static int
> @@ -158,6 +170,14 @@ ethnl_set_rings_validate(struct ethnl_req_info *req_info,
>  		return -EOPNOTSUPP;
>  	}
>  
> +	if (tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN] &&
> +	    !(ops->supported_ring_params & ETHTOOL_RING_USE_TX_PUSH_BUF_LEN)) {
> +		NL_SET_ERR_MSG_ATTR(info->extack,
> +				    tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN],
> +				    "setting tx push buf len is not supported");
> +		return -EOPNOTSUPP;
> +	}
> +
>  	return ops->get_ringparam && ops->set_ringparam ? 1 : -EOPNOTSUPP;
>  }
>  
> @@ -189,6 +209,8 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
>  			tb[ETHTOOL_A_RINGS_TX_PUSH], &mod);
>  	ethnl_update_u8(&kernel_ringparam.rx_push,
>  			tb[ETHTOOL_A_RINGS_RX_PUSH], &mod);
> +	ethnl_update_u32(&kernel_ringparam.tx_push_buf_len,
> +			 tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN], &mod);
>  	if (!mod)
>  		return 0;
>  
> @@ -209,6 +231,13 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
>  		return -EINVAL;
>  	}
>  
> +	if (kernel_ringparam.tx_push_buf_len > kernel_ringparam.tx_push_buf_max_len) {
> +		NL_SET_ERR_MSG_ATTR(info->extack, tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN],
> +				    "Requested TX push buffer exceeds maximum");

Would it make sense to display this maximum value (tx_push_buf_max_len)
in the message to the user (for convenience)?

Thanks,
Michal

> +
> +		return -EINVAL;
> +	}
> +
>  	ret = dev->ethtool_ops->set_ringparam(dev, &ringparam,
>  					      &kernel_ringparam, info->extack);
>  	return ret < 0 ? ret : 1;
> -- 
> 2.25.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ