Message-Id: <20230328205623.142075-1-saeed@kernel.org>
Date: Tue, 28 Mar 2023 13:56:08 -0700
From: Saeed Mahameed <saeed@...nel.org>
To: "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>
Cc: Saeed Mahameed <saeedm@...dia.com>, netdev@...r.kernel.org,
Tariq Toukan <tariqt@...dia.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Matthew Wilcox <willy@...radead.org>,
Toke Høiland-Jørgensen <toke@...hat.com>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>
Subject: [pull request][net-next 00/15] mlx5: Drop internal page cache implementation
From: Saeed Mahameed <saeedm@...dia.com>
Hi Dave, Hi Jakub,
This series from Dragos provides the patches that remove the mlx5
internal page cache implementation and convert mlx5 RX buffers to
completely rely on the standard page pool.
For more information please see tag log below.
Please pull and let me know if there is any problem.
Thanks,
Saeed.
The following changes since commit 86e2eca4ddedc07d639c44c990e1c220cac3741e:
net: ethernet: ti: am65-cpsw: enable p0 host port rx_vlan_remap (2023-03-28 15:29:50 +0200)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-updates-2023-03-28
for you to fetch changes up to 3905f8d64ccc2c640d8c1179f4452f2bf8f1df56:
net/mlx5e: RX, Remove unnecessary recycle parameter and page_cache stats (2023-03-28 13:43:59 -0700)
----------------------------------------------------------------
mlx5-updates-2023-03-28
Dragos Tatulea says:
====================
net/mlx5e: RX, Drop page_cache and fully use page_pool
For page allocation on the rx path, the mlx5e driver has been using an
internal page cache in tandem with the page pool. The internal page
cache uses a queue for page recycling which has the issue of head of
queue blocking.
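
As a rough illustration of head-of-queue blocking, here is a toy model
(not the driver code; Page, queue_cache_get and pool_get are hypothetical
names). It assumes the queue-based cache only ever inspects its head page,
while the pool only holds pages whose users are already done:

```python
from collections import deque


class Page:
    def __init__(self, name, stack_refs):
        self.name = name
        # References still held outside the driver (hypothetical field).
        self.stack_refs = stack_refs


def queue_cache_get(cache):
    """Toy FIFO cache: only the head page is considered for reuse."""
    if cache and cache[0].stack_refs == 0:
        return cache.popleft()
    return None  # empty cache or busy head: caller allocates a fresh page


def pool_get(pool):
    """Toy pool: it only contains pages that were returned once free."""
    return pool.pop() if pool else None


# Page A is still referenced by the stack; B and C are already free.
pages = [Page("A", 1), Page("B", 0), Page("C", 0)]

# The FIFO cache received the pages in order, busy or not.
cache = deque(pages)
# The pool only ever sees pages whose users are done.
pool = [p for p in pages if p.stack_refs == 0]

cache_hit = queue_cache_get(cache)
pool_hit = pool_get(pool)

print("queue cache:", cache_hit.name if cache_hit else "miss")
print("pool:", pool_hit.name if pool_hit else "miss")
```

In this model the busy page A at the head makes the queue lookup miss even
though B and C are free, while the pool lookup still returns a page.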
This patch series drops the internal page_cache altogether and uses the
page_pool to implement everything that was done by the page_cache
before:
* Let the page_pool handle dma mapping and unmapping.
* Use fragmented pages with fragment counter instead of tracking via
page ref.
* Enable skb recycling.
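
The fragment counter idea can be pictured with a minimal sketch. This is
a toy model under the assumption that a page is handed out as a fixed
number of fragments and becomes recyclable once the last one is released;
FragPage and release_frag are hypothetical names, not the page_pool API:

```python
class FragPage:
    """Toy page handed out in fragments and tracked by a fragment counter."""

    def __init__(self, nfrags):
        self.frags = nfrags  # fragments still outstanding

    def release_frag(self):
        """Drop one fragment; return True once the page can be recycled."""
        if self.frags <= 0:
            raise RuntimeError("fragment released more often than handed out")
        self.frags -= 1
        return self.frags == 0


page = FragPage(4)
results = [page.release_frag() for _ in range(4)]
print(results)
```

Only the release of the last fragment reports the page as recyclable, so
no per-user page reference has to be tracked in this model.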
The patch series has the following effects on the rx path:
* Improved performance for the cases when there was low page recycling
due to head-of-queue blocking in the internal page_cache. The test
for this ran a single iperf TCP stream to an rx queue that is
bound to the same CPU as the application.
|-------------+--------+--------+------+---------|
| rq type     | before | after  | unit | diff    |
|-------------+--------+--------+------+---------|
| striding rq | 30.1   | 31.4   | Gbps | 4.14 %  |
| legacy rq   | 30.2   | 33.0   | Gbps | 8.48 %  |
|-------------+--------+--------+------+---------|
* Small XDP performance degradation. The test was an XDP drop
program running on a single rx queue with small incoming packets:
|-------------+----------+----------+------+---------|
| rq type     | before   | after    | unit | diff    |
|-------------+----------+----------+------+---------|
| striding rq | 19725449 | 18544617 | pps  | -6.37 % |
| legacy rq   | 19879931 | 18631841 | pps  | -6.70 % |
|-------------+----------+----------+------+---------|
This will be handled in a different patch series by adding support for
multiple packets per page.
* For other cases the performance is roughly the same.
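
A note on reading the diff column in the two tables above: the values
appear to be computed relative to the "after" measurement rather than the
"before" one. The sketch below reproduces them under that assumption
(diff_percent is a hypothetical helper, not part of the series):

```python
def diff_percent(before, after):
    """Change from before to after, as a percentage of the 'after' value."""
    return (after - before) / after * 100.0


rows = [
    ("striding rq, TCP Gbps", 30.1, 31.4),
    ("legacy rq, TCP Gbps", 30.2, 33.0),
    ("striding rq, XDP pps", 19725449, 18544617),
    ("legacy rq, XDP pps", 19879931, 18631841),
]

for name, before, after in rows:
    print(f"{name}: {diff_percent(before, after):+.2f} %")
```

Measured against the "before" value instead, the same rows would read
+4.32 %, +9.27 %, -5.99 % and -6.28 %.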
The above numbers were obtained on the following system:
24 core Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
32 GB RAM
ConnectX-7 single port
The patch series breaks down as follows:
* Preparations for introducing the mlx5e_frag_page struct.
* Delete the mlx5e_page_cache struct.
* Enable dma mapping from page_pool.
* Enable skb recycling and fragment counting.
* Do deferred release of pages (just before alloc) to ensure better
page_pool cache utilization.
====================
----------------------------------------------------------------
Dragos Tatulea (15):
net/mlx5e: RX, Remove mlx5e_alloc_unit argument in page allocation
net/mlx5e: RX, Remove alloc unit layout constraint for legacy rq
net/mlx5e: RX, Remove alloc unit layout constraint for striding rq
net/mlx5e: RX, Store SHAMPO header pages in array
net/mlx5e: RX, Remove internal page_cache
net/mlx5e: RX, Enable dma map and sync from page_pool allocator
net/mlx5e: RX, Enable skb page recycling through the page_pool
net/mlx5e: RX, Rename xdp_xmit_bitmap to a more generic name
net/mlx5e: RX, Defer page release in striding rq for better recycling
net/mlx5e: RX, Change wqe last_in_page field from bool to bit flags
net/mlx5e: RX, Defer page release in legacy rq for better recycling
net/mlx5e: RX, Split off release path for xsk buffers for legacy rq
net/mlx5e: RX, Increase WQE bulk size for legacy rq
net/mlx5e: RX, Break the wqe bulk refill in smaller chunks
net/mlx5e: RX, Remove unnecessary recycle parameter and page_cache stats
.../ethernet/mellanox/mlx5/counters.rst | 26 --
drivers/net/ethernet/mellanox/mlx5/core/en.h | 51 ++-
.../net/ethernet/mellanox/mlx5/core/en/params.c | 53 ++-
.../ethernet/mellanox/mlx5/core/en/reporter_rx.c | 4 +-
drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h | 6 +-
drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 10 +-
.../net/ethernet/mellanox/mlx5/core/en/xsk/rx.c | 54 +--
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 167 +++++---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 452 +++++++++++----------
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 20 -
drivers/net/ethernet/mellanox/mlx5/core/en_stats.h | 10 -
11 files changed, 464 insertions(+), 389 deletions(-)