[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20241003160620.1521626-4-ap420073@gmail.com>
Date: Thu, 3 Oct 2024 16:06:16 +0000
From: Taehee Yoo <ap420073@...il.com>
To: davem@...emloft.net,
kuba@...nel.org,
pabeni@...hat.com,
edumazet@...gle.com,
almasrymina@...gle.com,
netdev@...r.kernel.org,
linux-doc@...r.kernel.org,
donald.hunter@...il.com,
corbet@....net,
michael.chan@...adcom.com
Cc: kory.maincent@...tlin.com,
andrew@...n.ch,
maxime.chevallier@...tlin.com,
danieller@...dia.com,
hengqi@...ux.alibaba.com,
ecree.xilinx@...il.com,
przemyslaw.kitszel@...el.com,
hkallweit1@...il.com,
ahmed.zaki@...el.com,
paul.greenwalt@...el.com,
rrameshbabu@...dia.com,
idosch@...dia.com,
asml.silence@...il.com,
kaiyuanz@...gle.com,
willemb@...gle.com,
aleksander.lobakin@...el.com,
dw@...idwei.uk,
sridhar.samudrala@...el.com,
bcreeley@....com,
ap420073@...il.com
Subject: [PATCH net-next v3 3/7] net: ethtool: add support for configuring tcp-data-split-thresh
The tcp-data-split-thresh option configures the threshold value of
the tcp-data-split.
If a received packet size is larger than this threshold value, a packet
will be split into header and payload.
The header indicates TCP header, but it depends on driver spec.
The bnxt_en driver supports HDS(Header-Data-Split) configuration at
FW level, affecting TCP and UDP too.
So, like the tcp-data-split option, If tcp-data-split-thresh is set,
it affects UDP and TCP packets.
The tcp-data-split-thresh has a dependency, that is tcp-data-split
option. This threshold value can be get/set only when tcp-data-split
option is enabled.
Example:
# ethtool -G <interface name> tcp-data-split-thresh <value>
# ethtool -G enp14s0f0np0 tcp-data-split on tcp-data-split-thresh 256
# ethtool -g enp14s0f0np0
Ring parameters for enp14s0f0np0:
Pre-set maximums:
...
TCP data split thresh: 256
Current hardware settings:
...
TCP data split: on
TCP data split thresh: 256
The tcp-data-split is not enabled, the tcp-data-split-thresh will
not be used and can't be configured.
# ethtool -G enp14s0f0np0 tcp-data-split off
# ethtool -g enp14s0f0np0
Ring parameters for enp14s0f0np0:
Pre-set maximums:
...
TCP data split thresh: 256
Current hardware settings:
...
TCP data split: off
TCP data split thresh: n/a
The default/min/max values are not defined in the ethtool so the drivers
should define themself.
The 0 value means that all TCP and UDP packets' header and payload
will be split.
Users should consider the overhead due to this feature.
Signed-off-by: Taehee Yoo <ap420073@...il.com>
---
v3:
- Fix documentation and ynl
- Update error messages
- Validate configuration of tcp-data-split and tcp-data-split-thresh
v2:
- Patch added.
Documentation/netlink/specs/ethtool.yaml | 8 +++
Documentation/networking/ethtool-netlink.rst | 75 ++++++++++++--------
include/linux/ethtool.h | 4 ++
include/uapi/linux/ethtool_netlink.h | 2 +
net/ethtool/netlink.h | 2 +-
net/ethtool/rings.c | 46 ++++++++++--
6 files changed, 102 insertions(+), 35 deletions(-)
diff --git a/Documentation/netlink/specs/ethtool.yaml b/Documentation/netlink/specs/ethtool.yaml
index 6a050d755b9c..96298fe5ed43 100644
--- a/Documentation/netlink/specs/ethtool.yaml
+++ b/Documentation/netlink/specs/ethtool.yaml
@@ -215,6 +215,12 @@ attribute-sets:
-
name: tx-push-buf-len-max
type: u32
+ -
+ name: tcp-data-split-thresh
+ type: u32
+ -
+ name: tcp-data-split-thresh-max
+ type: u32
-
name: mm-stat
@@ -1393,6 +1399,8 @@ operations:
- rx-push
- tx-push-buf-len
- tx-push-buf-len-max
+ - tcp-data-split-thresh
+ - tcp-data-split-thresh-max
dump: *ring-get-op
-
name: rings-set
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 295563e91082..f0cd918dbe7e 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -875,24 +875,32 @@ Request contents:
Kernel response contents:
- ======================================= ====== ===========================
- ``ETHTOOL_A_RINGS_HEADER`` nested reply header
- ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
- ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring
- ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo ring
- ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
- ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
- ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
- ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
- ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
- ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
- ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
- ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
- ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
- ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push buffer
- ======================================= ====== ===========================
+ ============================================= ====== =======================
+ ``ETHTOOL_A_RINGS_HEADER`` nested reply header
+ ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring
+ ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini
+ ring
+ ``ETHTOOL_A_RINGS_RX_JUMBO_MAX`` u32 max size of RX jumbo
+ ring
+ ``ETHTOOL_A_RINGS_TX_MAX`` u32 max size of TX ring
+ ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
+ ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
+ ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
+ ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
+ ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the
+ ring
+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
+ ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
+ ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX`` u32 max size of TX push
+ buffer
+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH`` u32 threshold of
+ TCP header / data split
+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX`` u32 max threshold of
+ TCP header / data split
+ ============================================= ====== =======================
``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` indicates whether the device is usable with
page-flipping TCP zero-copy receive (``getsockopt(TCP_ZEROCOPY_RECEIVE)``).
@@ -927,18 +935,21 @@ Sets ring sizes like ``ETHTOOL_SRINGPARAM`` ioctl request.
Request contents:
- ==================================== ====== ===========================
- ``ETHTOOL_A_RINGS_HEADER`` nested reply header
- ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
- ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
- ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
- ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
- ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
- ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
- ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
- ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
- ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
- ==================================== ====== ===========================
+ ========================================= ====== =======================
+ ``ETHTOOL_A_RINGS_HEADER`` nested reply header
+ ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring
+ ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring
+ ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring
+ ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring
+ ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring
+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT`` u8 TCP header / data split
+ ``ETHTOOL_A_RINGS_CQE_SIZE`` u32 Size of TX/RX CQE
+ ``ETHTOOL_A_RINGS_TX_PUSH`` u8 flag of TX Push mode
+ ``ETHTOOL_A_RINGS_RX_PUSH`` u8 flag of RX Push mode
+ ``ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN`` u32 size of TX push buffer
+ ``ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH`` u32 threshold of
+ TCP header / data split
+ ========================================= ====== =======================
Kernel checks that requested ring sizes do not exceed limits reported by
driver. Driver may impose additional constraints and may not support all
@@ -954,6 +965,10 @@ A bigger CQE can have more receive buffer pointers, and in turn the NIC can
transfer a bigger frame from wire. Based on the NIC hardware, the overall
completion queue size can be adjusted in the driver if CQE size is modified.
+``ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH`` specifies the threshold value of
+tcp data split feature. If tcp-data-split is enabled and a received packet
+size is larger than this threshold value, header and data will be split.
+
CHANNELS_GET
============
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 12f6dc567598..891f55b0f6aa 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -78,6 +78,8 @@ enum {
* @cqe_size: Size of TX/RX completion queue event
* @tx_push_buf_len: Size of TX push buffer
* @tx_push_buf_max_len: Maximum allowed size of TX push buffer
+ * @tcp_data_split_thresh: Threshold value of tcp-data-split
+ * @tcp_data_split_thresh_max: Maximum allowed threshold of tcp-data-split-threshold
*/
struct kernel_ethtool_ringparam {
u32 rx_buf_len;
@@ -87,6 +89,8 @@ struct kernel_ethtool_ringparam {
u32 cqe_size;
u32 tx_push_buf_len;
u32 tx_push_buf_max_len;
+ u32 tcp_data_split_thresh;
+ u32 tcp_data_split_thresh_max;
};
/**
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 283305f6b063..20fe6065b7ba 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -364,6 +364,8 @@ enum {
ETHTOOL_A_RINGS_RX_PUSH, /* u8 */
ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN, /* u32 */
ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX, /* u32 */
+ ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH, /* u32 */
+ ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX, /* u32 */
/* add new constants above here */
__ETHTOOL_A_RINGS_CNT,
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 203b08eb6c6f..8bea47a26605 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -455,7 +455,7 @@ extern const struct nla_policy ethnl_features_set_policy[ETHTOOL_A_FEATURES_WANT
extern const struct nla_policy ethnl_privflags_get_policy[ETHTOOL_A_PRIVFLAGS_HEADER + 1];
extern const struct nla_policy ethnl_privflags_set_policy[ETHTOOL_A_PRIVFLAGS_FLAGS + 1];
extern const struct nla_policy ethnl_rings_get_policy[ETHTOOL_A_RINGS_HEADER + 1];
-extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX + 1];
+extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX + 1];
extern const struct nla_policy ethnl_channels_get_policy[ETHTOOL_A_CHANNELS_HEADER + 1];
extern const struct nla_policy ethnl_channels_set_policy[ETHTOOL_A_CHANNELS_COMBINED_COUNT + 1];
extern const struct nla_policy ethnl_coalesce_get_policy[ETHTOOL_A_COALESCE_HEADER + 1];
diff --git a/net/ethtool/rings.c b/net/ethtool/rings.c
index b7865a14fdf8..c7824515857f 100644
--- a/net/ethtool/rings.c
+++ b/net/ethtool/rings.c
@@ -61,7 +61,9 @@ static int rings_reply_size(const struct ethnl_req_info *req_base,
nla_total_size(sizeof(u8)) + /* _RINGS_TX_PUSH */
nla_total_size(sizeof(u8))) + /* _RINGS_RX_PUSH */
nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN */
- nla_total_size(sizeof(u32)); /* _RINGS_TX_PUSH_BUF_LEN_MAX */
+ nla_total_size(sizeof(u32)) + /* _RINGS_TX_PUSH_BUF_LEN_MAX */
+ nla_total_size(sizeof(u32)) + /* _RINGS_TCP_DATA_SPLIT_THRESH */
+ nla_total_size(sizeof(u32)); /* _RINGS_TCP_DATA_SPLIT_THRESH_MAX */
}
static int rings_fill_reply(struct sk_buff *skb,
@@ -108,7 +110,13 @@ static int rings_fill_reply(struct sk_buff *skb,
(nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN_MAX,
kr->tx_push_buf_max_len) ||
nla_put_u32(skb, ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN,
- kr->tx_push_buf_len))))
+ kr->tx_push_buf_len))) ||
+ (kr->tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
+ (nla_put_u32(skb, ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH,
+ kr->tcp_data_split_thresh))) ||
+ (kr->tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_ENABLED &&
+ (nla_put_u32(skb, ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX,
+ kr->tcp_data_split_thresh_max))))
return -EMSGSIZE;
return 0;
@@ -130,6 +138,7 @@ const struct nla_policy ethnl_rings_set_policy[] = {
[ETHTOOL_A_RINGS_TX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
[ETHTOOL_A_RINGS_RX_PUSH] = NLA_POLICY_MAX(NLA_U8, 1),
[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN] = { .type = NLA_U32 },
+ [ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH] = { .type = NLA_U32 },
};
static int
@@ -155,6 +164,14 @@ ethnl_set_rings_validate(struct ethnl_req_info *req_info,
return -EOPNOTSUPP;
}
+ if (tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH] &&
+ !(ops->supported_ring_params & ETHTOOL_RING_USE_TCP_DATA_SPLIT)) {
+ NL_SET_ERR_MSG_ATTR(info->extack,
+ tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH],
+ "setting tcp-data-split-thresh is not supported");
+ return -EOPNOTSUPP;
+ }
+
if (tb[ETHTOOL_A_RINGS_CQE_SIZE] &&
!(ops->supported_ring_params & ETHTOOL_RING_USE_CQE_SIZE)) {
NL_SET_ERR_MSG_ATTR(info->extack,
@@ -196,9 +213,9 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
struct kernel_ethtool_ringparam kernel_ringparam = {};
struct ethtool_ringparam ringparam = {};
struct net_device *dev = req_info->dev;
+ bool mod = false, thresh_mod = false;
struct nlattr **tb = info->attrs;
const struct nlattr *err_attr;
- bool mod = false;
int ret;
dev->ethtool_ops->get_ringparam(dev, &ringparam,
@@ -222,9 +239,30 @@ ethnl_set_rings(struct ethnl_req_info *req_info, struct genl_info *info)
tb[ETHTOOL_A_RINGS_RX_PUSH], &mod);
ethnl_update_u32(&kernel_ringparam.tx_push_buf_len,
tb[ETHTOOL_A_RINGS_TX_PUSH_BUF_LEN], &mod);
- if (!mod)
+ ethnl_update_u32(&kernel_ringparam.tcp_data_split_thresh,
+ tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH],
+ &thresh_mod);
+ if (!mod && !thresh_mod)
return 0;
+ if (kernel_ringparam.tcp_data_split == ETHTOOL_TCP_DATA_SPLIT_DISABLED &&
+ thresh_mod) {
+ NL_SET_ERR_MSG_ATTR(info->extack,
+ tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH],
+ "tcp-data-split-thresh can not be updated while tcp-data-split is disabled");
+ return -EINVAL;
+ }
+
+ if (kernel_ringparam.tcp_data_split_thresh >
+ kernel_ringparam.tcp_data_split_thresh_max) {
+ NL_SET_ERR_MSG_ATTR_FMT(info->extack,
+ tb[ETHTOOL_A_RINGS_TCP_DATA_SPLIT_THRESH_MAX],
+ "Requested tcp-data-split-thresh exceeds the maximum of %u",
+ kernel_ringparam.tcp_data_split_thresh_max);
+
+ return -EINVAL;
+ }
+
/* ensure new ring parameters are within limits */
if (ringparam.rx_pending > ringparam.rx_max_pending)
err_attr = tb[ETHTOOL_A_RINGS_RX];
--
2.34.1
Powered by blists - more mailing lists