linux-kernel - Re: [PATCH v4 net-next 1/4] net: core: page_pool: add user cnt preventing pool deletion

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20190628085325.GA2795@khorivan>
Date:   Fri, 28 Jun 2019 11:53:26 +0300
From:   Ivan Khoronzhuk <ivan.khoronzhuk@...aro.org>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     davem@...emloft.net, grygorii.strashko@...com, saeedm@...lanox.com,
        leon@...nel.org, ast@...nel.org, linux-kernel@...r.kernel.org,
        linux-omap@...r.kernel.org, ilias.apalodimas@...aro.org,
        netdev@...r.kernel.org, daniel@...earbox.net,
        jakub.kicinski@...ronome.com, john.fastabend@...il.com
Subject: Re: [PATCH v4 net-next 1/4] net: core: page_pool: add user cnt
 preventing pool deletion

On Fri, Jun 28, 2019 at 08:35:20AM +0200, Jesper Dangaard Brouer wrote:
>On Fri, 28 Jun 2019 01:02:47 +0300
>Ivan Khoronzhuk <ivan.khoronzhuk@...aro.org> wrote:
>
>> Hi Jesper, thanks you remember about it.
>>
>> >
>> >I don't think that "create" and "free" routines paring looks "more
>> >correct" together.
>> >
>> >Maybe we can scale back your solution(?), via creating a page_pool_get()
>> >and page_pool_put() API that can be used by your driver, to keep the
>> >page_pool object after a xdp_rxq_info_unreg() call.  Then you can use
>> >it for two xdp_rxq_info structs, and call page_pool_put() after you
>> >have unregistered both.
>> >
>> >The API would basically be:
>> >
>> >diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>> >index b366f59885c1..691ddacfb5a6 100644
>> >--- a/net/core/page_pool.c
>> >+++ b/net/core/page_pool.c
>> >@@ -357,6 +357,10 @@ static void __warn_in_flight(struct page_pool *pool)
>> > void __page_pool_free(struct page_pool *pool)
>> > {
>> >        WARN(pool->alloc.count, "API usage violation");
>> >+
>> >+       if (atomic_read(&pool->user_cnt) != 0)
>> >+               return;
>> >+
>> >        WARN(!ptr_ring_empty(&pool->ring), "ptr_ring is not empty");
>> >
>> >        /* Can happen due to forced shutdown */
>> >@@ -372,6 +376,19 @@ void __page_pool_free(struct page_pool *pool)
>> > }
>> > EXPORT_SYMBOL(__page_pool_free);
>> >
>> >+void page_pool_put(struct page_pool *pool)
>> >+{
>> >+       if (!atomic_dec_and_test(&pool->user_cnt))
>> >+               __page_pool_free(pool);
>> >+}
>> >+EXPORT_SYMBOL(page_pool_put);
>> >+
>> >+void page_pool_get(struct page_pool *pool)
>> >+{
>> >+       atomic_inc(&pool->user_cnt);
>> >+}
>> >+EXPORT_SYMBOL(page_pool_get);
>> >+
>>
>> I have another solution that doesn't touch page pool and adds modifications
>> to xdp allocator. As for me it looks better and work wider, I don't need to
>> think about this in the driver also.
>>
>> It's supposed allocator works as before, no any changes to mlx5 and
>> page_pool API and its usage and seems like fits your requirements.
>> It still supposes that allocator runs under same napi softirq but allows
>> to reuse allocator.
>>
>> I have not verified yet, but looks like:
>>
>> diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
>> index 6a8cba6ea79a..995b21da2f27 100644
>> --- a/include/net/xdp_priv.h
>> +++ b/include/net/xdp_priv.h
>> @@ -18,6 +18,7 @@ struct xdp_mem_allocator {
>>  	struct rcu_head rcu;
>>  	struct delayed_work defer_wq;
>>  	unsigned long defer_warn;
>> +	unsigned long refcnt;
>>  };
>>
>>  #endif /* __LINUX_NET_XDP_PRIV_H__ */
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index f98ab6b98674..6239483e3793 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -98,6 +98,12 @@ static bool __mem_id_disconnect(int id, bool force)
>>  		WARN(1, "Request remove non-existing id(%d), driver bug?", id);
>>  		return true;
>>  	}
>> +
>> +	if (--xa->refcnt) {
>> +		mutex_unlock(&mem_id_lock);
>> +		return true;
>
>This doesn't work.  This function __mem_id_disconnect() can be called
>multiple times. E.g. if there are in-flight packets.
Yes, it was draft. I still have not completely verify it due
to several changes in cpsw and holiday in my side, second draft
looks like:

Subject: [PATCH] net: core: xdp: allow same allocator for rxq

XDP rxq can be same for ndevs running under same rx napi softirq.
But there is no ability to register same allocator for both rxqs,
by fact it's same rxq but has different ndev as a reference.

Due to last changes allocator destroy can be defered till the moment
all packets are recycled by destination interface, afterwards it's
freed. In order to schedule allocator destroy only after all users are
unregistered, add refcnt to allocator object and start to destroy
only it reaches 0.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@...aro.org>
---
 include/net/xdp_priv.h |  1 +
 net/core/xdp.c         | 46 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/include/net/xdp_priv.h b/include/net/xdp_priv.h
index 6a8cba6ea79a..995b21da2f27 100644
--- a/include/net/xdp_priv.h
+++ b/include/net/xdp_priv.h
@@ -18,6 +18,7 @@ struct xdp_mem_allocator {
 	struct rcu_head rcu;
 	struct delayed_work defer_wq;
 	unsigned long defer_warn;
+	unsigned long refcnt;
 };
 
 #endif /* __LINUX_NET_XDP_PRIV_H__ */
diff --git a/net/core/xdp.c b/net/core/xdp.c
index f98ab6b98674..7b0185eec124 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -98,6 +98,18 @@ static bool __mem_id_disconnect(int id, bool force)
 		WARN(1, "Request remove non-existing id(%d), driver bug?", id);
 		return true;
 	}
+
+	/* to avoid hash lookup twice do refcnt dec here, but not when
+	 * it's 0 as it can be called from workqueue aftewards
+	 */
+	if (xa->refcnt)
+		xa->refcnt--;
+
+	if (xa->refcnt) {
+		mutex_unlock(&mem_id_lock);
+		return true;
+	}
+
 	xa->disconnect_cnt++;
 
 	/* Detects in-flight packet-pages for page_pool */
@@ -312,6 +324,33 @@ static bool __is_supported_mem_type(enum xdp_mem_type type)
 	return true;
 }
 
+static struct xdp_mem_allocator *xdp_allocator_get(void *allocator)
+{
+	struct xdp_mem_allocator *xae, *xa = NULL;
+	struct rhashtable_iter iter;
+
+	mutex_lock(&mem_id_lock);
+	rhashtable_walk_enter(mem_id_ht, &iter);
+	do {
+		rhashtable_walk_start(&iter);
+
+		while ((xae = rhashtable_walk_next(&iter)) && !IS_ERR(xae)) {
+			if (xae->allocator == allocator) {
+				xae->refcnt++;
+				xa = xae;
+				break;
+			}
+		}
+
+		rhashtable_walk_stop(&iter);
+
+	} while (xae == ERR_PTR(-EAGAIN));
+	rhashtable_walk_exit(&iter);
+	mutex_unlock(&mem_id_lock);
+
+	return xa;
+}
+
 int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 			       enum xdp_mem_type type, void *allocator)
 {
@@ -347,6 +386,12 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 		}
 	}
 
+	xdp_alloc = xdp_allocator_get(allocator);
+	if (xdp_alloc) {
+		xdp_rxq->mem.id = xdp_alloc->mem.id;
+		return 0;
+	}
+
 	xdp_alloc = kzalloc(sizeof(*xdp_alloc), gfp);
 	if (!xdp_alloc)
 		return -ENOMEM;
@@ -360,6 +405,7 @@ int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 	xdp_rxq->mem.id = id;
 	xdp_alloc->mem  = xdp_rxq->mem;
 	xdp_alloc->allocator = allocator;
+	xdp_alloc->refcnt = 1;
 
 	/* Insert allocator into ID lookup table */
 	ptr = rhashtable_insert_slow(mem_id_ht, &id, &xdp_alloc->node);

-- 
Regards,
Ivan Khoronzhuk