Message-ID: <20241105210338.5364375d@kernel.org>
Date: Tue, 5 Nov 2024 21:03:38 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Joe Damato <jdamato@...tly.com>
Cc: netdev@...r.kernel.org, corbet@....net, hdanton@...a.com,
bagasdotme@...il.com, pabeni@...hat.com, namangulati@...gle.com,
edumazet@...gle.com, amritha.nambiar@...el.com,
sridhar.samudrala@...el.com, sdf@...ichev.me, peter@...eblog.net,
m2shafiei@...terloo.ca, bjorn@...osinc.com, hch@...radead.org,
willy@...radead.org, willemdebruijn.kernel@...il.com, skhawaja@...gle.com,
Martin Karsten <mkarsten@...terloo.ca>, "David S. Miller"
<davem@...emloft.net>, Simon Horman <horms@...nel.org>, David Ahern
<dsahern@...nel.org>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Lorenzo Bianconi <lorenzo@...nel.org>, Alexander Lobakin
<aleksander.lobakin@...el.com>, linux-kernel@...r.kernel.org (open list)
Subject: Re: [PATCH net-next v6 2/7] net: Suspend softirq when prefer_busy_poll is set

On Mon, 4 Nov 2024 21:55:26 +0000 Joe Damato wrote:
> From: Martin Karsten <mkarsten@...terloo.ca>
>
> When NAPI_F_PREFER_BUSY_POLL is set during busy_poll_stop and the
> irq_suspend_timeout is nonzero, this timeout is used to defer softirq
> scheduling, potentially longer than gro_flush_timeout. This can be used
> to effectively suspend softirq processing during the time it takes for
> an application to process data and return to the next busy-poll.
>
> The call to napi->poll in busy_poll_stop might lead to an invocation of

The call to napi->poll when we're arming the timer is
counterproductive, right? Maybe we can take this opportunity to add
the seemingly missing logic to skip over it?

> napi_complete_done, but the prefer-busy flag is still set at that time,
> so the same logic is used to defer softirq scheduling for
> irq_suspend_timeout.
>
> Signed-off-by: Martin Karsten <mkarsten@...terloo.ca>
> Co-developed-by: Joe Damato <jdamato@...tly.com>
> Signed-off-by: Joe Damato <jdamato@...tly.com>
> Tested-by: Joe Damato <jdamato@...tly.com>
> Tested-by: Martin Karsten <mkarsten@...terloo.ca>
> Acked-by: Stanislav Fomichev <sdf@...ichev.me>
> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@...el.com>
> ---
> v3:
> - Removed reference to non-existent sysfs parameter from commit
> message. No functional/code changes.
>
> net/core/dev.c | 17 +++++++++++++----
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 4d910872963f..51d88f758e2e 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6239,7 +6239,12 @@ bool napi_complete_done(struct napi_struct *n, int work_done)
> timeout = napi_get_gro_flush_timeout(n);
> n->defer_hard_irqs_count = napi_get_defer_hard_irqs(n);
> }
> - if (n->defer_hard_irqs_count > 0) {
> + if (napi_prefer_busy_poll(n)) {
> + timeout = napi_get_irq_suspend_timeout(n);

Why look at the suspend timeout in napi_complete_done()?
We are unlikely to be exiting busy poll here.
Is it because we need more time than gro_flush_timeout
for the application to take over the polling?

> + if (timeout)
> + ret = false;
> + }
> + if (ret && n->defer_hard_irqs_count > 0) {
> n->defer_hard_irqs_count--;
> timeout = napi_get_gro_flush_timeout(n);
> if (timeout)
> @@ -6375,9 +6380,13 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock,
> bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
>
> if (flags & NAPI_F_PREFER_BUSY_POLL) {
> - napi->defer_hard_irqs_count = napi_get_defer_hard_irqs(napi);
> - timeout = napi_get_gro_flush_timeout(napi);
> - if (napi->defer_hard_irqs_count && timeout) {
> + timeout = napi_get_irq_suspend_timeout(napi);

Even here I'm not sure if we need to trigger suspend.
I don't know the eventpoll code well, but it seems like you suspend
and resume based on events when exiting epoll. Why also here?

> + if (!timeout) {
> + napi->defer_hard_irqs_count = napi_get_defer_hard_irqs(napi);
> + if (napi->defer_hard_irqs_count)
> + timeout = napi_get_gro_flush_timeout(napi);
> + }
> + if (timeout) {
> hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED);
> skip_schedule = true;
> }
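
For readers following the second hunk, the timeout selection it introduces in
busy_poll_stop() can be sketched in userspace as below. This is only a model
under stated assumptions, not kernel code: struct napi_model, its fields, and
busy_poll_stop_timeout() are hypothetical stand-ins for the napi_get_*()
accessors in the patch, and the nanosecond values in the example are
arbitrary.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the per-NAPI settings the hunk reads. */
struct napi_model {
	unsigned long irq_suspend_timeout;  /* ns; 0 means suspend is not configured */
	unsigned long gro_flush_timeout;    /* ns */
	unsigned int defer_hard_irqs;       /* configured deferral count */
	unsigned int defer_hard_irqs_count; /* running counter */
};

/*
 * Models the timeout choice in the patched busy_poll_stop() when
 * NAPI_F_PREFER_BUSY_POLL is set. Returns the duration (ns) the timer
 * would be armed with, or 0 if no timer would be armed.
 */
static unsigned long busy_poll_stop_timeout(struct napi_model *n)
{
	unsigned long timeout = n->irq_suspend_timeout;

	if (!timeout) {
		/* No suspend timeout: fall back to the pre-patch deferral logic. */
		n->defer_hard_irqs_count = n->defer_hard_irqs;
		if (n->defer_hard_irqs_count)
			timeout = n->gro_flush_timeout;
	}
	return timeout;
}

/* Usage example covering the three cases the hunk distinguishes. */
static void busy_poll_stop_timeout_examples(void)
{
	struct napi_model suspend = { .irq_suspend_timeout = 20000000,
				      .gro_flush_timeout = 200000,
				      .defer_hard_irqs = 2 };
	struct napi_model fallback = { .gro_flush_timeout = 200000,
				       .defer_hard_irqs = 2 };
	struct napi_model none = { .gro_flush_timeout = 200000 };

	/* Suspend timeout set: it wins and the deferral counter is untouched. */
	assert(busy_poll_stop_timeout(&suspend) == 20000000);
	assert(suspend.defer_hard_irqs_count == 0);

	/* Suspend timeout zero: counter is reloaded, gro_flush_timeout is used. */
	assert(busy_poll_stop_timeout(&fallback) == 200000);
	assert(fallback.defer_hard_irqs_count == 2);

	/* Neither suspend nor deferral configured: no timer is armed. */
	assert(busy_poll_stop_timeout(&none) == 0);
}
```

In the patch itself, a nonzero result leads to hrtimer_start() and
skip_schedule = true; the model stops at returning the value.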