[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20180721041439.23358-2-jakub.kicinski@netronome.com>
Date:   Fri, 20 Jul 2018 21:14:39 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     davem@...emloft.net
Cc:     oss-drivers@...ronome.com, netdev@...r.kernel.org,
        Jakub Kicinski <jakub.kicinski@...ronome.com>
Subject: [PATCH net-next 2/2] nfp: avoid buffer leak when FW communication fails
After device is stopped we reset the rings by moving all free buffers
to positions [0, cnt - 2], and clear the position cnt - 1 in the ring.
We then proceed to clear the read/write pointers.  This means that if
we try to reset the ring again the code will assume that the next to
fill buffer is at position 0 and swap it with cnt - 1.  Since we
previously cleared position cnt - 1 it will lead to leaking the first
buffer and leaving ring in a bad state.
This scenario can only happen if FW communication fails, in which case
the ring will never be used again, so the fact it's in a bad state will
not be noticed.  Buffer leak is the only problem.  Don't try to move
buffers in the ring if the read/write pointers indicate the ring was
never used or have already been reset.
nfp_net_clear_config_and_disable() is now fully idempotent.
Found by code inspection, FW communication failures are very rare,
and reconfiguring a live device is not common either, so it's unlikely
anyone has ever noticed the leak.
Signed-off-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@...ronome.com>
---
This is arguably net material but IMHO the risk of me missing something
this could break is higher than the error actually occurring, and a
page leak on a FW communication error doesn't seem like it's worth
it at -rc6 time..  I'm happy to respin if I'm wrong!
drivers/net/ethernet/netronome/nfp/nfp_net_common.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 279b8ab8a17b..cf1704e972b7 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -1078,7 +1078,7 @@ static bool nfp_net_xdp_complete(struct nfp_net_tx_ring *tx_ring)
  * @dp:		NFP Net data path struct
  * @tx_ring:	TX ring structure
  *
- * Assumes that the device is stopped
+ * Assumes that the device is stopped, must be idempotent.
  */
 static void
 nfp_net_tx_ring_reset(struct nfp_net_dp *dp, struct nfp_net_tx_ring *tx_ring)
@@ -1280,13 +1280,18 @@ static void nfp_net_rx_give_one(const struct nfp_net_dp *dp,
  * nfp_net_rx_ring_reset() - Reflect in SW state of freelist after disable
  * @rx_ring:	RX ring structure
  *
- * Warning: Do *not* call if ring buffers were never put on the FW freelist
- *	    (i.e. device was not enabled)!
+ * Assumes that the device is stopped, must be idempotent.
  */
 static void nfp_net_rx_ring_reset(struct nfp_net_rx_ring *rx_ring)
 {
 	unsigned int wr_idx, last_idx;
 
+	/* wr_p == rd_p means ring was never fed FL bufs.  RX rings are always
+	 * kept at cnt - 1 FL bufs.
+	 */
+	if (rx_ring->wr_p == 0 && rx_ring->rd_p == 0)
+		return;
+
 	/* Move the empty entry to the end of the list */
 	wr_idx = D_IDX(rx_ring, rx_ring->wr_p);
 	last_idx = rx_ring->cnt - 1;
@@ -2508,6 +2513,8 @@ static void nfp_net_vec_clear_ring_data(struct nfp_net *nn, unsigned int idx)
 /**
  * nfp_net_clear_config_and_disable() - Clear control BAR and disable NFP
  * @nn:      NFP Net device to reconfigure
+ *
+ * Warning: must be fully idempotent.
  */
 static void nfp_net_clear_config_and_disable(struct nfp_net *nn)
 {
-- 
2.17.1
Powered by blists - more mailing lists
 
