Message-ID: <0b411583-72f8-54c1-dc48-c270e1ed8ac7@huawei.com>
Date: Thu, 19 Jan 2023 20:24:53 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: <linux-rdma@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC: <jgg@...pe.ca>, Leon Romanovsky <leon@...nel.org>,
Saeed Mahameed <saeedm@...dia.com>, <xuhaoyue1@...ilicon.com>,
"liyangyang20@...wei.com" <liyangyang20@...wei.com>,
<will@...nel.org>, Peter Zijlstra <peterz@...radead.org>
Subject: Question about ordering between cq polling and notifying hw
Hi all,
After polling a cq, the driver usually needs to notify the hw of the new ci.
Looking through the drivers that implement cq polling and use a record
db[1], I see no memory barrier between parsing the valid cqe and
notifying the hw of the new ci:
For the ib mlx5 driver, it always uses the record db to notify the hw, and there
is no memory barrier between parsing the valid cqe and notifying the hw of the new ci:
https://elixir.bootlin.com/linux/v6.2-rc4/source/drivers/infiniband/hw/mlx5/cq.c#L637
For the ib hns driver, it supports both record db and normal db. There is a
memory barrier when using writeq to ring the normal db, but there is no
memory barrier for the record db:
https://elixir.bootlin.com/linux/v6.2-rc4/source/drivers/infiniband/hw/hns/hns_roce_hw_v2.c#L4136
For the ethernet mlx5 driver, the tx cq polling does have a memory
barrier, but it is placed after notifying the hw of the new ci, not between
parsing the valid cqe and notifying the hw of the new ci:
https://elixir.bootlin.com/linux/v6.2-rc4/source/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c#L872
Do we need to ensure ordering between parsing the valid cqe and notifying
the hw of the new ci? Without that ordering, could reordering cause the
cpu to notify the hw of the new ci before it has finished parsing the last
valid cqe, causing the hw to overwrite that cqe while the driver is still
parsing it?
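To make the concern concrete, here is a minimal userspace model of the pattern I am asking about. It is only a sketch, not driver code: struct cq, poll_cq(), hw_post(), CQ_DEPTH and the owner-bit convention are all hypothetical names and assumptions of mine, and C11 atomics stand in for whatever the kernel would use. The release store on db_record models the ordering in question, since a C11 release store keeps the earlier loads and stores of the same thread from being reordered after it.

```c
#include <stdatomic.h>
#include <stdint.h>

#define CQ_DEPTH 8u /* hypothetical ring size, power of two */

/* Hypothetical CQE: the "hw" flips the owner bit on every pass over the ring. */
struct cqe {
	_Atomic uint8_t owner;
	uint32_t wqe_counter;
};

struct cq {
	struct cqe ring[CQ_DEPTH];
	_Atomic uint32_t db_record; /* consumer index as seen by the "hw" */
	uint32_t ci;                /* driver-private consumer index */
};

/* Owner value the "hw" writes on the pass that index idx belongs to. */
static uint8_t owner_for(uint32_t idx)
{
	return (uint8_t)(((idx / CQ_DEPTH) & 1u) ^ 1u);
}

/* Model of the hw side: refuse to write a slot the driver has not released. */
static int hw_post(struct cq *cq, uint32_t pi, uint32_t wqe_counter)
{
	uint32_t ci = atomic_load_explicit(&cq->db_record, memory_order_acquire);
	struct cqe *cqe = &cq->ring[pi % CQ_DEPTH];

	if (pi - ci >= CQ_DEPTH)
		return 0; /* ring full from the hw point of view */

	cqe->wqe_counter = wqe_counter;
	atomic_store_explicit(&cqe->owner, owner_for(pi), memory_order_release);
	return 1;
}

/* Model of the driver side: parse valid cqes, then publish the new ci. */
static uint32_t poll_cq(struct cq *cq, uint32_t budget, uint64_t *sum)
{
	uint32_t done = 0;

	while (done < budget) {
		struct cqe *cqe = &cq->ring[cq->ci % CQ_DEPTH];

		/* Acquire: read the cqe payload only after seeing it as valid. */
		if (atomic_load_explicit(&cqe->owner, memory_order_acquire) !=
		    owner_for(cq->ci))
			break;

		*sum += cqe->wqe_counter; /* "parsing the valid cqe" */
		cq->ci++;
		done++;
	}

	/*
	 * Release: the cqe reads above cannot be reordered after this store.
	 * With a plain (relaxed) store here, the model would allow the "hw"
	 * to observe the new ci before the last cqe read has completed.
	 */
	if (done)
		atomic_store_explicit(&cq->db_record, cq->ci, memory_order_release);

	return done;
}
```

In this model, hw_post() reuses a slot only after it observes the updated db_record, so the release store is what keeps a slot from being rewritten while poll_cq() is still reading it. Whether the real hw and the real drivers need an equivalent guarantee is exactly what I would like to understand.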
For the ethernet mlx5 driver, even though there is a comment above the wmb()
barrier, I am not sure I understand what ordering the memory barrier enforces
and why that ordering is needed:
bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
{
	......................
	parsing the valid cqe
	......................

	mlx5_cqwq_update_db_record(&cq->wq);

	/* ensure cq space is freed before enabling more cqes */
	wmb();

	sq->dma_fifo_cc = dma_fifo_cc;
	sq->cc = sqcc;

	netdev_tx_completed_queue(sq->txq, npkts, nbytes);

	if (netif_tx_queue_stopped(sq->txq) &&
	    mlx5e_wqc_has_room_for(&sq->wq, sq->cc, sq->pc, sq->stop_room) &&
	    mlx5e_ptpsq_fifo_has_room(sq) &&
	    !test_bit(MLX5E_SQ_STATE_RECOVERING, &sq->state)) {
		netif_tx_wake_queue(sq->txq);
		stats->wake++;
	}

	return (i == MLX5E_TX_CQ_POLL_BUDGET);
}
[1] Doorbell records are located in physical memory. The address of the doorbell
record is passed to the HW at RQ/SQ creation. See:
https://network.nvidia.com/files/doc-2020/ethernet-adapters-programming-manual.pdf