[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f4807945-4a7c-c701-86ae-bd2bb01af781@gmail.com>
Date: Thu, 22 Apr 2021 12:45:18 +0300
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Hans Westgaard Ry <hans.westgaard.ry@...cle.com>,
Tariq Toukan <tariqt@...dia.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org
Subject: Re: [PATCH v2 net-next] net/mlx4: Treat VFs fair when handling
comm_channel_events
On 4/22/2021 9:21 AM, Hans Westgaard Ry wrote:
> Handling comm_channel_event in mlx4_master_comm_channel uses a double
> loop to determine which slaves have requested work. The search is
> always started at lowest slave. This leads to unfairness; lower VFs
> tends to be prioritized over higher VFs.
>
> The patch uses find_next_bit to determine which slaves to handle.
> Fairness is implemented by always starting at the next to the last
> start.
>
> An MPI program has been used to measure improvements. It runs 500
> ibv_reg_mr, synchronizes with all other instances and then runs 500
> ibv_dereg_mr.
>
> The results running 500 processes, time reported is for running 500
> calls:
>
> ibv_reg_mr:
> Mod. Org.
> mlx4_1 403.356ms 424.674ms
> mlx4_2 403.355ms 424.674ms
> mlx4_3 403.354ms 424.674ms
> mlx4_4 403.355ms 424.674ms
> mlx4_5 403.357ms 424.677ms
> mlx4_6 403.354ms 424.676ms
> mlx4_7 403.357ms 424.675ms
> mlx4_8 403.355ms 424.675ms
>
> ibv_dereg_mr:
> Mod. Org.
> mlx4_1 116.408ms 142.818ms
> mlx4_2 116.434ms 142.793ms
> mlx4_3 116.488ms 143.247ms
> mlx4_4 116.679ms 143.230ms
> mlx4_5 112.017ms 107.204ms
> mlx4_6 112.032ms 107.516ms
> mlx4_7 112.083ms 184.195ms
> mlx4_8 115.089ms 190.618ms
>
> Suggested-by: HÃ¥kon Bugge <haakon.bugge@...cle.com>
> Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@...cle.com>
> ---
> v1 -> v2:
> - removed set but not user varible,
> reported by 'kernel test robot'
> - fixed reversed Christmas tree,
> removed else,
> removed extra varibles in printout,
> moved next_slave to struct mlx4_mfunc_master_ctx,
> as suggested by Tariq Toukan
> drivers/net/ethernet/mellanox/mlx4/cmd.c | 69 ++++++++++++++++++-------------
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 1 +
> 2 files changed, 41 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> index c678344d22a2..8d751383530b 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
> @@ -2241,43 +2241,52 @@ void mlx4_master_comm_channel(struct work_struct *work)
> struct mlx4_priv *priv =
> container_of(mfunc, struct mlx4_priv, mfunc);
> struct mlx4_dev *dev = &priv->dev;
> - __be32 *bit_vec;
> + u32 lbit_vec[COMM_CHANNEL_BIT_ARRAY_SIZE];
> + u32 nmbr_bits;
> u32 comm_cmd;
> - u32 vec;
> - int i, j, slave;
> + int i, slave;
> int toggle;
> + bool first = true;
> int served = 0;
> int reported = 0;
> u32 slt;
>
> - bit_vec = master->comm_arm_bit_vector;
> - for (i = 0; i < COMM_CHANNEL_BIT_ARRAY_SIZE; i++) {
> - vec = be32_to_cpu(bit_vec[i]);
> - for (j = 0; j < 32; j++) {
> - if (!(vec & (1 << j)))
> - continue;
> - ++reported;
> - slave = (i * 32) + j;
> - comm_cmd = swab32(readl(
> - &mfunc->comm[slave].slave_write));
> - slt = swab32(readl(&mfunc->comm[slave].slave_read))
> - >> 31;
> - toggle = comm_cmd >> 31;
> - if (toggle != slt) {
> - if (master->slave_state[slave].comm_toggle
> - != slt) {
> - pr_info("slave %d out of sync. read toggle %d, state toggle %d. Resynching.\n",
> - slave, slt,
> - master->slave_state[slave].comm_toggle);
> - master->slave_state[slave].comm_toggle =
> - slt;
> - }
> - mlx4_master_do_cmd(dev, slave,
> - comm_cmd >> 16 & 0xff,
> - comm_cmd & 0xffff, toggle);
> - ++served;
> + for (i = 0; i < COMM_CHANNEL_BIT_ARRAY_SIZE; i++)
> + lbit_vec[i] = be32_to_cpu(master->comm_arm_bit_vector[i]);
> + nmbr_bits = dev->persist->num_vfs + 1;
> + if (++master->next_slave >= nmbr_bits)
> + master->next_slave = 0;
> + slave = master->next_slave;
> + while (true) {
> + slave = find_next_bit((const unsigned long *)&lbit_vec, nmbr_bits, slave);
> + if (!first && slave >= master->next_slave)
> + break;
> + if (slave == nmbr_bits) {
> + if (!first)
> + break;
> + first = false;
> + slave = 0;
> + continue;
> + }
> + ++reported;
> + comm_cmd = swab32(readl(&mfunc->comm[slave].slave_write));
> + slt = swab32(readl(&mfunc->comm[slave].slave_read)) >> 31;
> + toggle = comm_cmd >> 31;
> + if (toggle != slt) {
> + if (master->slave_state[slave].comm_toggle
> + != slt) {
> + pr_info("slave %d out of sync. read toggle %d, state toggle %d. Resynching.\n",
> + slave, slt,
> + master->slave_state[slave].comm_toggle);
> + master->slave_state[slave].comm_toggle =
> + slt;
> }
> + mlx4_master_do_cmd(dev, slave,
> + comm_cmd >> 16 & 0xff,
> + comm_cmd & 0xffff, toggle);
> + ++served;
> }
> + slave++;
> }
>
> if (reported && reported != served)
> @@ -2389,6 +2398,8 @@ int mlx4_multi_func_init(struct mlx4_dev *dev)
> if (!priv->mfunc.master.vf_oper)
> goto err_comm_oper;
>
> + priv->mfunc.master.next_slave = 0;
> +
> for (i = 0; i < dev->num_slaves; ++i) {
> vf_admin = &priv->mfunc.master.vf_admin[i];
> vf_oper = &priv->mfunc.master.vf_oper[i];
> diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> index 64bed7ac3836..6ccf340660d9 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
> @@ -603,6 +603,7 @@ struct mlx4_mfunc_master_ctx {
> struct mlx4_slave_event_eq slave_eq;
> struct mutex gen_eqe_mutex[MLX4_MFUNC_MAX];
> struct mlx4_qos_manager qos_ctl[MLX4_MAX_PORTS + 1];
> + u32 next_slave; /* mlx4_master_comm_channel */
> };
>
> struct mlx4_mfunc {
>
Reviewed-by: Tariq Toukan <tariqt@...dia.com>
Thanks,
Tariq
Powered by blists - more mailing lists