Message-ID: <20230928234735.3026489-1-clm@fb.com>
Date: Thu, 28 Sep 2023 16:47:35 -0700
From: Chris Mason <clm@...com>
To: <netdev@...r.kernel.org>, <kuba@...nel.org>, <dw@...idwei.uk>,
        <dtatulea@...dia.com>, <saeedm@...dia.com>
Subject: [PATCH RFC] net/mlx5e: avoid page pool frag counter underflow

[ This is just an RFC because I've wandered pretty far from home and
really don't know the code at hand.  The errors are real though: an ENOMEM
during mlx5e_refill_rx_wqes() leads to underflows and system instability. ]

mlx5e_refill_rx_wqes() has roughly the following flow:

1) mlx5e_free_rx_wqes()
2) mlx5e_alloc_rx_wqes()

We're doing bulk frees before refilling the frags in bulk, and under
normal conditions this is all well balanced.  Every time we call
mlx5e_refill_rx_wqes(), the first thing we do is free the existing frags.

But, if we get an ENOMEM from mlx5e_get_rx_frag(), we will have called
mlx5e_free_rx_wqes() on a bunch of frags without refilling the pages for
them.
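
Roughly, the sequence looks like this (a paraphrased sketch with stand-in
arguments and return values, not the actual driver code; it assumes the
alloc helper reports how far it got):

        /* paraphrased sketch only, not the real mlx5e code */
        mlx5e_free_rx_wqes(rq, ix, wqe_bulk);          /* 1) bulk free old frags     */
        count = mlx5e_alloc_rx_wqes(rq, ix, wqe_bulk); /* 2) refill, ENOMEM possible */

        /* wqes past 'count' were freed above but never refilled; the next
         * refill pass frees those same frags a second time
         */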

mlx5e_page_release_fragmented() doesn't take any steps to remember that
a given frag has already been put through page_pool_defrag_page(), so in
the ENOMEM case, repeated calls to mlx5e_free_rx_wqes() without
corresponding allocations end up underflowing the counter in
page_pool_defrag_page():

        ret = atomic_long_sub_return(nr, &page->pp_frag_count);
        WARN_ON(ret < 0);
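
For what it's worth, here's a throwaway userspace model of that arithmetic
(made-up bias and frag counts standing in for MLX5E_PAGECNT_BIAS_MAX and
page->pp_frag_count, not driver code).  The second release drains a bias
that was only ever charged once, so the counter goes negative:

        #include <stdio.h>
        #include <stdatomic.h>

        #define BIAS_MAX 64     /* made-up stand-in for MLX5E_PAGECNT_BIAS_MAX */

        static atomic_long pp_frag_count;

        struct frag_page {
                long frags;     /* frags handed out since the page was taken */
        };

        static void take_page(struct frag_page *fp)
        {
                /* models page_pool_fragment_page(): the bias is charged once */
                atomic_store(&pp_frag_count, BIAS_MAX);
                fp->frags = 3;  /* pretend 3 frags got used */
        }

        static void release_fragmented(struct frag_page *fp)
        {
                /* models the drain done by mlx5e_page_release_fragmented()
                 * via page_pool_defrag_page()
                 */
                long drain = BIAS_MAX - fp->frags;
                long ret = atomic_fetch_sub(&pp_frag_count, drain) - drain;

                printf("pp_frag_count is now %ld%s\n",
                       ret, ret < 0 ? "   <-- WARN_ON(ret < 0)" : "");
        }

        int main(void)
        {
                struct frag_page fp;

                take_page(&fp);
                release_fragmented(&fp);  /* refill pass N: free            */
                /* ENOMEM on refill, so no take_page() before the next pass */
                release_fragmented(&fp);  /* refill pass N+1: free again    */
                return 0;
        }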

Reproducing this just needs a memory hog driving the system into OOM and
a heavy network rx load.

My guess at a fix is to clear the frag's state so we don't send the page
through defrag more than once.  I've only lightly tested this, but it
doesn't immediately crash on OOM anymore and doesn't seem to leak.

Fixes: 6f5742846053c7 ("net/mlx5e: RX, Enable skb page recycling through the page_pool")
Signed-off-by: Chris Mason <clm@...com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 3fd11b0761e0..9a7b10f0bba9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -298,6 +298,16 @@ static void mlx5e_page_release_fragmented(struct mlx5e_rq *rq,
 	u16 drain_count = MLX5E_PAGECNT_BIAS_MAX - frag_page->frags;
 	struct page *page = frag_page->page;
 
+	if (!page)
+		return;
+
+	/*
+	 * we're dropping all of our counts on this page, make sure we
+	 * don't do it again the next time we process this frag
+	 */
+	frag_page->frags = 0;
+	frag_page->page = NULL;
+
 	if (page_pool_defrag_page(page, drain_count) == 0)
 		page_pool_put_defragged_page(rq->page_pool, page, -1, true);
 }
-- 
2.34.1

