Message-ID: <20250214-grinning-upbeat-chowchow-5c0e2f@leitao>
Date: Fri, 14 Feb 2025 08:43:28 -0800
From: Breno Leitao <leitao@...ian.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Frederic Weisbecker <frederic@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Boqun Feng <boqun.feng@...il.com>, Waiman Long <longman@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, Hayes Wang <hayeswang@...ltek.com>,
linux-usb@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 1/2] net: Assert proper context while calling napi_schedule()
Hello Jakub,
On Thu, Feb 13, 2025 at 07:14:26AM -0800, Jakub Kicinski wrote:
> ... How about we add an hrtimer to netdevsim,
> schedule it to fire 5usec in the future instead of scheduling NAPI
> immediately? We can call napi_schedule() from a timer safely.
I hacked a way to do so. Is this what you had in mind?
Author: Breno Leitao <leitao@...ian.org>
Date: Wed Feb 12 09:50:51 2025 -0800
netdevsim: call napi_schedule from a timer context
The netdevsim driver was experiencing NOHZ tick-stop errors during packet
transmission due to pending softirq work when calling napi_schedule().
This issue was observed when running the netconsole selftest, which
triggered the following error message:
NOHZ tick-stop error: local softirq work is pending, handler #08!!!
To fix this issue, stop calling napi_schedule() directly from the TX
path and call it from timer context instead. Create an hrtimer for each
queue and arm it from the TX path; the timer callback then calls
napi_schedule().
Suggested-by: Jakub Kicinski <kuba@...nel.org>
Signed-off-by: Breno Leitao <leitao@...ian.org>
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index 42f247cbdceec..cd56904a39049 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -87,7 +87,7 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
if (unlikely(nsim_forward_skb(peer_dev, skb, rq) == NET_RX_DROP))
goto out_drop_cnt;
- napi_schedule(&rq->napi);
+ hrtimer_start(&rq->napi_timer, ns_to_ktime(5 * NSEC_PER_USEC), HRTIMER_MODE_REL);
rcu_read_unlock();
u64_stats_update_begin(&ns->syncp);
@@ -426,6 +426,25 @@ static int nsim_init_napi(struct netdevsim *ns)
return err;
}
+static enum hrtimer_restart nsim_napi_schedule(struct hrtimer *timer)
+{
+ struct nsim_rq *rq;
+
+ rq = container_of(timer, struct nsim_rq, napi_timer);
+ napi_schedule(&rq->napi);
+ /* TODO: Should HRTIMER_RESTART be returned if napi_schedule returns
+ * false?
+ */
+
+ return HRTIMER_NORESTART;
+}
+
+static void nsim_rq_timer_init(struct nsim_rq *rq)
+{
+ hrtimer_init(&rq->napi_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+ rq->napi_timer.function = nsim_napi_schedule;
+}
+
static void nsim_enable_napi(struct netdevsim *ns)
{
struct net_device *dev = ns->netdev;
@@ -436,6 +455,7 @@ static void nsim_enable_napi(struct netdevsim *ns)
netif_queue_set_napi(dev, i, NETDEV_QUEUE_TYPE_RX, &rq->napi);
napi_enable(&rq->napi);
+ nsim_rq_timer_init(rq);
}
}
@@ -461,6 +481,7 @@ static void nsim_del_napi(struct netdevsim *ns)
for (i = 0; i < dev->num_rx_queues; i++) {
struct nsim_rq *rq = ns->rq[i];
+ hrtimer_cancel(&rq->napi_timer);
napi_disable(&rq->napi);
__netif_napi_del(&rq->napi);
}
diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
index dcf073bc4802e..2b396c517ac1d 100644
--- a/drivers/net/netdevsim/netdevsim.h
+++ b/drivers/net/netdevsim/netdevsim.h
@@ -97,6 +97,7 @@ struct nsim_rq {
struct napi_struct napi;
struct sk_buff_head skb_queue;
struct page_pool *page_pool;
+ struct hrtimer napi_timer;
};
struct netdevsim {
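For readers who want to see the deferral pattern from the patch in isolation, here is a minimal userspace sketch. It is not kernel code: struct toy_timer, struct toy_rq and every toy_* function are hypothetical stand-ins invented for illustration, and the container_of() macro is a simplified userspace version of the kernel one. The sketch only models two ideas from the patch: the TX path arms a one-shot timer instead of scheduling directly, and the timer callback recovers its queue from the embedded timer member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of():
 * recover the enclosing struct from a pointer to one of its members.
 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical toy timer, loosely modelled on a one-shot hrtimer. */
struct toy_timer {
	bool armed;
	void (*function)(struct toy_timer *timer);
};

/* Hypothetical toy RX queue with the timer embedded in it. */
struct toy_rq {
	int napi_scheduled;		/* how many times "NAPI" was scheduled */
	struct toy_timer napi_timer;
};

/* Timer callback: find the owning queue and "schedule NAPI" on it. */
static void toy_napi_schedule_cb(struct toy_timer *timer)
{
	struct toy_rq *rq = container_of(timer, struct toy_rq, napi_timer);

	rq->napi_scheduled++;
}

/* "TX path": only arms the timer, never schedules directly. */
static void toy_xmit(struct toy_rq *rq)
{
	rq->napi_timer.armed = true;
}

/* "Timer context": fire an armed timer once, then leave it disarmed. */
static void toy_timer_expire(struct toy_timer *timer)
{
	if (!timer->armed)
		return;
	timer->armed = false;
	timer->function(timer);
}

/* Usage: two transmits before expiry result in a single schedule. */
int toy_demo(void)
{
	struct toy_rq rq = { .napi_scheduled = 0 };

	rq.napi_timer.function = toy_napi_schedule_cb;

	toy_xmit(&rq);
	toy_xmit(&rq);			/* timer already armed, still one pending */
	assert(rq.napi_scheduled == 0);	/* nothing scheduled from the TX path */

	toy_timer_expire(&rq.napi_timer);
	toy_timer_expire(&rq.napi_timer);	/* no-op, timer is not armed */

	return rq.napi_scheduled;
}
```

In this toy model, transmits that happen before the timer fires collapse into one scheduling event. How a real hrtimer behaves when it is re-armed while pending is outside what the sketch tries to show.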