Message-ID: <CANn89iLCv0f3vBYt8W+_ZDuNeOY1jDLDBfMbOj7Hzi8s0xQCZA@mail.gmail.com>
Date: Fri, 1 Mar 2024 09:30:32 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Yan Zhai <yan@...udflare.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Jiri Pirko <jiri@...nulli.us>,
Simon Horman <horms@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Lorenzo Bianconi <lorenzo@...nel.org>, Coco Li <lixiaoyan@...gle.com>, Wei Wang <weiwan@...gle.com>,
Alexander Duyck <alexanderduyck@...com>, Hannes Frederic Sowa <hannes@...essinduktion.org>,
linux-kernel@...r.kernel.org, rcu@...r.kernel.org, bpf@...r.kernel.org,
kernel-team@...udflare.com, Joel Fernandes <joel@...lfernandes.org>,
"Paul E. McKenney" <paulmck@...nel.org>, Toke Høiland-Jørgensen <toke@...hat.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>, Steven Rostedt <rostedt@...dmis.org>, mark.rutland@....com
Subject: Re: [PATCH v2] net: raise RCU qs after each threaded NAPI poll
On Fri, Mar 1, 2024 at 4:50 AM Yan Zhai <yan@...udflare.com> wrote:
>
> On Thu, Feb 29, 2024 at 9:47 PM Yan Zhai <yan@...udflare.com> wrote:
> >
> > We noticed task RCUs being blocked when threaded NAPIs are very busy at
> > workloads: detaching any BPF tracing programs, i.e. removing a ftrace
> > trampoline, will simply block for very long in rcu_tasks_wait_gp. This
> > ranges from hundreds of seconds to even an hour, severely harming any
...
> >
> > Fixes: 29863d41bb6e ("net: implement threaded-able napi poll loop support")
> > Suggested-by: Paul E. McKenney <paulmck@...nel.org>
> > Reviewed-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > Signed-off-by: Yan Zhai <yan@...udflare.com>
> > ---
> > v1->v2: moved rcu_softirq_qs out from bh critical section, and only
> > raise it after a second of repolling. Added some brief perf test result.
> >
> Link to v1: https://lore.kernel.org/netdev/Zd4DXTyCf17lcTfq@debian.debian/T/#u
> And I apparently forgot to rename the subject since it's not raising
> after every poll (let me know if it is preferred to send a V3 to fix
> it)
>
I could not see the reason for the 1 second (HZ) delay.
Would calling rcu_softirq_qs() every ~10ms instead be a serious issue?
In any case, if this is all about rcu_tasks, I would prefer using a macro
defined in kernel/rcu/tasks.h instead of having a hidden constant in a
networking core function.
Thanks.
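
For illustration only, the idea above (report a quiescent state once a named
interval has elapsed, rather than burying a constant in the poll loop) can be
sketched in plain user-space C. This is not kernel code and not the patch
under review: RCU_TASKS_QS_INTERVAL_MS, struct poll_state, report_qs(),
poll_maybe_report_qs() and simulate_busy_poll() are all hypothetical names,
report_qs() merely stands in for rcu_softirq_qs(), and time is a simulated
millisecond counter instead of jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical macro: stands in for a named interval that would be defined
 * alongside the RCU tasks code instead of inside the networking core. */
#define RCU_TASKS_QS_INTERVAL_MS 10

/* Hypothetical per-poller bookkeeping. */
struct poll_state {
	uint64_t last_qs_ms;    /* simulated time of the last reported QS */
	unsigned long qs_count; /* number of quiescent states reported */
};

/* Stand-in for rcu_softirq_qs(): only records that a QS was reported. */
static void report_qs(struct poll_state *st, uint64_t now_ms)
{
	st->last_qs_ms = now_ms;
	st->qs_count++;
}

/* Called after each poll iteration. Reports a quiescent state only when
 * the named interval has elapsed; returns true if one was reported. */
static bool poll_maybe_report_qs(struct poll_state *st, uint64_t now_ms)
{
	assert(now_ms >= st->last_qs_ms); /* simulated clock is monotonic */

	if (now_ms - st->last_qs_ms < RCU_TASKS_QS_INTERVAL_MS)
		return false;

	report_qs(st, now_ms);
	return true;
}

/* Simulate a poller that never goes idle, polling once per millisecond
 * for duration_ms, and return how many quiescent states it reported. */
static unsigned long simulate_busy_poll(uint64_t duration_ms)
{
	struct poll_state st = { .last_qs_ms = 0, .qs_count = 0 };

	for (uint64_t now = 1; now <= duration_ms; now++)
		poll_maybe_report_qs(&st, now);

	return st.qs_count;
}
```

With the interval at 10ms, a poller that stays busy for one simulated second
reports 100 quiescent states; changing the interval only requires touching
the macro, which is the point of keeping the constant out of the poll loop.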