[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <IA3PR11MB898669A877442CEE90396753E57CA@IA3PR11MB8986.namprd11.prod.outlook.com>
Date: Fri, 20 Jun 2025 10:05:56 +0000
From: "Loktionov, Aleksandr" <aleksandr.loktionov@...el.com>
To: "Keller, Jacob E" <jacob.e.keller@...el.com>, Intel Wired LAN
<intel-wired-lan@...ts.osuosl.org>
CC: "Keller, Jacob E" <jacob.e.keller@...el.com>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "Chittim, Madhu" <madhu.chittim@...el.com>, "Cao,
Yahui" <yahui.cao@...el.com>, "Nguyen, Anthony L"
<anthony.l.nguyen@...el.com>, "Kitszel, Przemyslaw"
<przemyslaw.kitszel@...el.com>
Subject: RE: [Intel-wired-lan] [PATCH iwl-next 2/8] ice: add functions to get
and set Tx queue context
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@...osl.org> On Behalf
> Of Jacob Keller
> Sent: Thursday, June 19, 2025 12:25 AM
> To: Intel Wired LAN <intel-wired-lan@...ts.osuosl.org>
> Cc: Keller, Jacob E <jacob.e.keller@...el.com>;
> netdev@...r.kernel.org; Chittim, Madhu <madhu.chittim@...el.com>; Cao,
> Yahui <yahui.cao@...el.com>; Nguyen, Anthony L
> <anthony.l.nguyen@...el.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@...el.com>
> Subject: [Intel-wired-lan] [PATCH iwl-next 2/8] ice: add functions to
> get and set Tx queue context
>
> The live migration driver will need to save and restore the Tx queue
> context state from the hardware registers. This state contains both
> static fields which do not change during Tx traffic as well as dynamic
> fields which may change during Tx traffic.
>
> Unlike the Rx context, the Tx queue context is accessed indirectly
> from GLCOMM_QTX_CNTX_CTL and GLCOMM_QTX_CNTX_DATA registers. These
> registers are shared by multiple PFs on the same PCIe card. Multiple
> PFs cannot safely access the registers simultaneously, and there is no
> hardware semaphore or logic to control access. To handle this,
> introduce the txq_ctx_lock to the ice_adapter structure. This is
> similar to the ptp_gltsyn_time_lock. All PFs on the same adapter share
> this structure, and use it to serialize access to the registers to
> prevent error.
Is the solution compatible if different PF ports are passed through to different VMs?
> Add a new functions to get and set the Tx queue context through the
> GLCOMM_QTX_CNTX_CTL interface. The hardware context values are stored
> in the registers using the same packed format as the Admin Queue
> buffer.
>
> The hardware buffer is 40 bytes wide, as it contains an additional 18
> bytes of internal state not sent with the Admin Queue buffer. For this
> reason, a separate typedef and packing function must be used. We can
> share the same packed fields definitions because we never need to
> unpack the internal state. This is preferred, as it ensures the
> internal state is zero'd when writing into HW, and avoids issues with
> reading by u32 registers into a buffer of 22 bytes in length. Thanks
> to the typedefs, misuse of the API with the wrong size buffer can
> easily be caught at compile time.
>
> Note reading this data from hardware is essential because the current
> Tx queue context may be different from the context as initially
> programmed by the driver during VF initialization. When migrating a VF
> we must ensure the target VF has identical context as the source VF
> did.
>
> Co-developed-by: Yahui Cao <yahui.cao@...el.com>
> Signed-off-by: Yahui Cao <yahui.cao@...el.com>
> Signed-off-by: Jacob Keller <jacob.e.keller@...el.com>
> Reviewed-by: Madhu Chittim <madhu.chittim@...el.com>
> ---
> drivers/net/ethernet/intel/ice/ice_adapter.h | 2 +
> drivers/net/ethernet/intel/ice/ice_adminq_cmd.h | 14 +-
> drivers/net/ethernet/intel/ice/ice_common.h | 4 +
> drivers/net/ethernet/intel/ice/ice_hw_autogen.h | 12 ++
> drivers/net/ethernet/intel/ice/ice_adapter.c | 1 +
> drivers/net/ethernet/intel/ice/ice_common.c | 173
> +++++++++++++++++++++++-
> 6 files changed, 202 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.h
> b/drivers/net/ethernet/intel/ice/ice_adapter.h
> index
> ac15c0d2bc1a47b17999999713bbbfcb96b7c5a7..1f31b407e125fe6ca7eee4663ea9
> 07878d612b0a 100644
> --- a/drivers/net/ethernet/intel/ice/ice_adapter.h
> +++ b/drivers/net/ethernet/intel/ice/ice_adapter.h
> @@ -38,6 +38,8 @@ struct ice_adapter {
> refcount_t refcount;
> /* For access to the GLTSYN_TIME register */
> spinlock_t ptp_gltsyn_time_lock;
> + /* For access to GLCOMM_QTX_CNTX_CTL register */
> + spinlock_t txq_ctx_lock;
>
> struct ice_pf *ctrl_pf;
> struct ice_port_list ports;
> diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> index
> dc39f0d772297befad1d99bc4fd703c83cb98d78..859b555efa634562fd469f380f27
> 5c92f379d981 100644
> --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
> @@ -16,11 +16,23 @@
>
> #define ICE_RXQ_CTX_SIZE_DWORDS 8
> #define ICE_RXQ_CTX_SZ (ICE_RXQ_CTX_SIZE_DWORDS *
> sizeof(u32))
> -#define ICE_TXQ_CTX_SZ 22
>
> typedef struct __packed { u8 buf[ICE_RXQ_CTX_SZ]; }
> ice_rxq_ctx_buf_t;
> +
> +/* The Tx queue context is 40 bytes, and includes some internal
> state.
> +The
> + * Admin Queue buffers don't include the internal state, so only
> +include the
> + * first 22 bytes of the context.
> + */
> +#define ICE_TXQ_CTX_SZ 22
> +
> typedef struct __packed { u8 buf[ICE_TXQ_CTX_SZ]; }
> ice_txq_ctx_buf_t;
>
> +#define ICE_TXQ_CTX_FULL_SIZE_DWORDS 10
> +#define ICE_TXQ_CTX_FULL_SZ \
> + (ICE_TXQ_CTX_FULL_SIZE_DWORDS * sizeof(u32))
> +
> +typedef struct __packed { u8 buf[ICE_TXQ_CTX_FULL_SZ]; }
> +ice_txq_ctx_buf_full_t;
> +
> /* Queue Shutdown (direct 0x0003) */
> struct ice_aqc_q_shutdown {
> u8 driver_unloading;
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.h
> b/drivers/net/ethernet/intel/ice/ice_common.h
> index
> 5f15bf83f06a8992f8b260c128df2c625f0bb9f1..0c8705687b99ebaedcad5dcba644
> 32ea85bdbc5d 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.h
> +++ b/drivers/net/ethernet/intel/ice/ice_common.h
> @@ -120,6 +120,10 @@ int ice_write_rxq_ctx(struct ice_hw *hw, struct
> ice_rlan_ctx *rlan_ctx,
> u32 rxq_index);
> int ice_read_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx
> *rlan_ctx,
> u32 rxq_index);
> +int ice_read_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx
> *tlan_ctx,
> + u32 txq_index);
> +int ice_write_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx
> *tlan_ctx,
> + u32 txq_index);
>
> int
> ice_aq_get_rss_lut(struct ice_hw *hw, struct
> ice_aq_get_set_rss_lut_params *get_params); diff --git
> a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
> b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
> index
> aa4bfbcf85d28e23678c4401dfd9375ce189f2d3..dd520aa4d1d6aa4b19c501e3b873
> f4f068301db9 100644
> --- a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
> +++ b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
> @@ -16,6 +16,7 @@
> #define GLCOMM_QUANTA_PROF_MAX_DESC_M ICE_M(0x3F, 24)
> #define QTX_COMM_DBELL(_DBQM) (0x002C0000 + ((_DBQM)
> * 4))
> #define QTX_COMM_HEAD(_DBQM) (0x000E0000 + ((_DBQM)
> * 4))
> +#define QTX_COMM_HEAD_MAX_INDEX 16383
> #define QTX_COMM_HEAD_HEAD_S 0
> #define QTX_COMM_HEAD_HEAD_M ICE_M(0x1FFF, 0)
> #define PF_FW_ARQBAH 0x00080180
> @@ -272,6 +273,8 @@
> #define VPINT_ALLOC_PCI_VALID_M BIT(31)
> #define VPINT_MBX_CTL(_VSI) (0x0016A000 + ((_VSI)
> * 4))
> #define VPINT_MBX_CTL_CAUSE_ENA_M BIT(30)
> +#define PFLAN_TX_QALLOC(_PF) (0x001D2580 + ((_PF) *
> 4))
> +#define PFLAN_TX_QALLOC_FIRSTQ_M GENMASK(13, 0)
> #define GLLAN_RCTL_0 0x002941F8
> #define QRX_CONTEXT(_i, _QRX) (0x00280000 + ((_i) *
> 8192 + (_QRX) * 4))
> #define QRX_CTRL(_QRX) (0x00120000 + ((_QRX)
> * 4))
> @@ -376,6 +379,15 @@
> #define GLNVM_ULD_POR_DONE_1_M BIT(8)
> #define GLNVM_ULD_PCIER_DONE_2_M BIT(9)
> #define GLNVM_ULD_PE_DONE_M BIT(10)
> +#define GLCOMM_QTX_CNTX_CTL 0x002D2DC8
> +#define GLCOMM_QTX_CNTX_CTL_QUEUE_ID_M GENMASK(13, 0)
> +#define GLCOMM_QTX_CNTX_CTL_CMD_M GENMASK(18, 16)
> +#define GLCOMM_QTX_CNTX_CTL_CMD_READ 0
> +#define GLCOMM_QTX_CNTX_CTL_CMD_WRITE 1
> +#define GLCOMM_QTX_CNTX_CTL_CMD_RESET 3
> +#define GLCOMM_QTX_CNTX_CTL_CMD_WRITE_NO_DYN 4
> +#define GLCOMM_QTX_CNTX_CTL_CMD_EXEC_M BIT(19)
> +#define GLCOMM_QTX_CNTX_DATA(_i) (0x002D2D40 + ((_i) * 4))
> #define GLPCI_CNF2 0x000BE004
> #define GLPCI_CNF2_CACHELINE_SIZE_M BIT(1)
> #define PF_FUNC_RID 0x0009E880
> diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c
> b/drivers/net/ethernet/intel/ice/ice_adapter.c
> index
> 66e070095d1bbe822842d0923e5c44872b0af076..9e4adc43e474c960b8ee4849380a
> 691a0e5ef848 100644
> --- a/drivers/net/ethernet/intel/ice/ice_adapter.c
> +++ b/drivers/net/ethernet/intel/ice/ice_adapter.c
> @@ -32,6 +32,7 @@ static struct ice_adapter *ice_adapter_new(u64 dsn)
>
> adapter->device_serial_number = dsn;
> spin_lock_init(&adapter->ptp_gltsyn_time_lock);
> + spin_lock_init(&adapter->txq_ctx_lock);
> refcount_set(&adapter->refcount, 1);
>
> mutex_init(&adapter->ports.lock);
> diff --git a/drivers/net/ethernet/intel/ice/ice_common.c
> b/drivers/net/ethernet/intel/ice/ice_common.c
> index
> 2800ec4763688c0d194d29686b470e555a457c1c..95e40779b176c0b1e7c8d5f44a0d
> 50b7f66fa0f8 100644
> --- a/drivers/net/ethernet/intel/ice/ice_common.c
> +++ b/drivers/net/ethernet/intel/ice/ice_common.c
> @@ -1513,12 +1513,12 @@ static const struct packed_field_u8
> ice_tlan_ctx_fields[] = { };
>
> /**
> - * ice_pack_txq_ctx - Pack Tx queue context into a HW buffer
> + * ice_pack_txq_ctx - Pack Tx queue context into Admin Queue buffer
> * @ctx: the Tx queue context to pack
> - * @buf: the HW buffer to pack into
> + * @buf: the Admin Queue HW buffer to pack into
> *
> * Pack the Tx queue context from the CPU-friendly unpacked buffer
> into its
> - * bit-packed HW layout.
> + * bit-packed Admin Queue layout.
> */
> void ice_pack_txq_ctx(const struct ice_tlan_ctx *ctx,
> ice_txq_ctx_buf_t *buf) { @@ -1526,6 +1526,173 @@ void
> ice_pack_txq_ctx(const struct ice_tlan_ctx *ctx, ice_txq_ctx_buf_t
> *buf)
> QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST); }
>
> +/**
> + * ice_pack_txq_ctx_full - Pack Tx queue context into a HW buffer
> + * @ctx: the Tx queue context to pack
> + * @buf: the HW buffer to pack into
> + *
> + * Pack the Tx queue context from the CPU-friendly unpacked buffer
> into
> +its
> + * bit-packed HW layout, including the internal data portion.
> + */
> +static void ice_pack_txq_ctx_full(const struct ice_tlan_ctx *ctx,
> + ice_txq_ctx_buf_full_t *buf)
> +{
> + pack_fields(buf, sizeof(*buf), ctx, ice_tlan_ctx_fields,
> + QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST); }
> +
> +/**
> + * ice_unpack_txq_ctx_full - Unpack Tx queue context from a HW buffer
> + * @buf: the HW buffer to unpack from
> + * @ctx: the Tx queue context to unpack
> + *
> + * Unpack the Tx queue context from the HW buffer (including the full
> +internal
> + * state) into the CPU-friendly structure.
> + */
> +static void ice_unpack_txq_ctx_full(const ice_txq_ctx_buf_full_t
> *buf,
> + struct ice_tlan_ctx *ctx)
> +{
> + unpack_fields(buf, sizeof(*buf), ctx, ice_tlan_ctx_fields,
> + QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST); }
> +
> +/**
> + * ice_copy_txq_ctx_from_hw - Copy Tx Queue context from HW registers
> + * @hw: pointer to the hardware structure
> + * @txq_ctx: pointer to the packed Tx queue context, including
> internal
> +state
> + * @txq_index: the index of the Tx queue
> + *
> + * Copy Tx Queue context from HW register space to dense structure
> */
> +static void ice_copy_txq_ctx_from_hw(struct ice_hw *hw,
> + ice_txq_ctx_buf_full_t *txq_ctx,
> + u32 txq_index)
> +{
> + struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
> + u32 *ctx = (u32 *)txq_ctx;
> + u32 txq_base, reg;
> +
> + /* Get Tx queue base within card space */
> + txq_base = rd32(hw, PFLAN_TX_QALLOC(hw->pf_id));
> + txq_base = FIELD_GET(PFLAN_TX_QALLOC_FIRSTQ_M, txq_base);
> +
> + reg = FIELD_PREP(GLCOMM_QTX_CNTX_CTL_CMD_M,
> + GLCOMM_QTX_CNTX_CTL_CMD_READ) |
> + FIELD_PREP(GLCOMM_QTX_CNTX_CTL_QUEUE_ID_M,
> + txq_base + txq_index) |
> + GLCOMM_QTX_CNTX_CTL_CMD_EXEC_M;
> +
> + /* Prevent other PFs on the same adapter from accessing the Tx
> queue
> + * context interface concurrently.
> + */
> + spin_lock(&pf->adapter->txq_ctx_lock);
> +
> + wr32(hw, GLCOMM_QTX_CNTX_CTL, reg);
> + ice_flush(hw);
> +
> + /* Copy each dword separately from HW */
> + for (int i = 0; i < ICE_TXQ_CTX_FULL_SIZE_DWORDS; i++, ctx++) {
> + *ctx = rd32(hw, GLCOMM_QTX_CNTX_DATA(i));
> +
> + ice_debug(hw, ICE_DBG_QCTX, "qtxdata[%d]: %08X\n", i,
> *ctx);
> + }
> +
> + spin_unlock(&pf->adapter->txq_ctx_lock);
> +}
> +
> +/**
> + * ice_copy_txq_ctx_to_hw - Copy Tx Queue context into HW registers
> + * @hw: pointer to the hardware structure
> + * @txq_ctx: pointer to the packed Tx queue context, including
> internal
> +state
> + * @txq_index: the index of the Tx queue */ static void
> +ice_copy_txq_ctx_to_hw(struct ice_hw *hw,
> + const ice_txq_ctx_buf_full_t *txq_ctx,
> + u32 txq_index)
> +{
> + struct ice_pf *pf = container_of(hw, struct ice_pf, hw);
> + u32 txq_base, reg;
> +
> + /* Get Tx queue base within card space */
> + txq_base = rd32(hw, PFLAN_TX_QALLOC(hw->pf_id));
> + txq_base = FIELD_GET(PFLAN_TX_QALLOC_FIRSTQ_M, txq_base);
> +
> + reg = FIELD_PREP(GLCOMM_QTX_CNTX_CTL_CMD_M,
> + GLCOMM_QTX_CNTX_CTL_CMD_WRITE_NO_DYN) |
> + FIELD_PREP(GLCOMM_QTX_CNTX_CTL_QUEUE_ID_M,
> + txq_base + txq_index) |
> + GLCOMM_QTX_CNTX_CTL_CMD_EXEC_M;
> +
> + /* Prevent other PFs on the same adapter from accessing the Tx
> queue
> + * context interface concurrently.
> + */
> + spin_lock(&pf->adapter->txq_ctx_lock);
> +
> + /* Copy each dword separately to HW */
> + for (int i = 0; i < ICE_TXQ_CTX_FULL_SIZE_DWORDS; i++) {
> + u32 ctx = ((const u32 *)txq_ctx)[i];
> +
> + wr32(hw, GLCOMM_QTX_CNTX_DATA(i), ctx);
> +
> + ice_debug(hw, ICE_DBG_QCTX, "qtxdata[%d]: %08X\n", i,
> ctx);
> + }
> +
> + wr32(hw, GLCOMM_QTX_CNTX_CTL, reg);
> + ice_flush(hw);
> +
> + spin_unlock(&pf->adapter->txq_ctx_lock);
> +}
> +
> +/**
> + * ice_read_txq_ctx - Read Tx queue context from HW
> + * @hw: pointer to the hardware structure
> + * @tlan_ctx: pointer to the Tx queue context
> + * @txq_index: the index of the Tx queue
> + *
> + * Read the Tx queue context from the HW registers, then unpack it
> into
> +the
> + * ice_tlan_ctx structure for use.
> + *
> + * Returns: 0 on success, or -EINVAL on an invalid Tx queue index.
> + */
> +int ice_read_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx
> *tlan_ctx,
> + u32 txq_index)
> +{
> + ice_txq_ctx_buf_full_t buf = {};
> +
> + if (txq_index > QTX_COMM_HEAD_MAX_INDEX)
> + return -EINVAL;
> +
> + ice_copy_txq_ctx_from_hw(hw, &buf, txq_index);
> + ice_unpack_txq_ctx_full(&buf, tlan_ctx);
> +
> + return 0;
> +}
> +
> +/**
> + * ice_write_txq_ctx - Write Tx queue context to HW
> + * @hw: pointer to the hardware structure
> + * @tlan_ctx: pointer to the Tx queue context
> + * @txq_index: the index of the Tx queue
> + *
> + * Pack the Tx queue context into the dense HW layout, then write it
> +into the
> + * HW registers.
> + *
> + * Returns: 0 on success, or -EINVAL on an invalid Tx queue index.
> + */
> +int ice_write_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx
> *tlan_ctx,
> + u32 txq_index)
> +{
> + ice_txq_ctx_buf_full_t buf = {};
> +
> + if (txq_index > QTX_COMM_HEAD_MAX_INDEX)
> + return -EINVAL;
> +
> + ice_pack_txq_ctx_full(tlan_ctx, &buf);
> + ice_copy_txq_ctx_to_hw(hw, &buf, txq_index);
> +
> + return 0;
> +}
> +
> /* Sideband Queue command wrappers */
>
> /**
>
> --
> 2.48.1.397.gec9d649cc640
Powered by blists - more mailing lists