Message-ID: <20251015114956.GC3419281@noisy.programming.kicks-ass.net>
Date: Wed, 15 Oct 2025 13:49:56 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Florian Westphal <fw@...len.de>
Cc: "Li,Rongqing" <lirongqing@...du.com>,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
"netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>,
"coreteam@...filter.org" <coreteam@...filter.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH net-next] netfilter: conntrack: Reduce cond_resched frequency in gc_worker

On Wed, Oct 15, 2025 at 01:06:01PM +0200, Florian Westphal wrote:
> Li,Rongqing <lirongqing@...du.com> wrote:
>
> [ CC scheduler experts & drop netfilter maintainers ]
>
> Context: proposed patch
> (https://patchwork.ozlabs.org/project/netfilter-devel/patch/20251014115103.2678-1-lirongqing@baidu.com/)
> does:
>
> -		cond_resched();
> +		if (jiffies - resched_time > msecs_to_jiffies(1)) {
> +			cond_resched();
> +			resched_time = jiffies;
> +		}
>
> ... and my knee-jerk reaction was "reject".
>
> But author pointed me at:
> commit 271557de7cbfdecb08e89ae1ca74647ceb57224f
> xfs: reduce the rate of cond_resched calls inside scrub
>
> So:
>
> Is calling cond_resched() unconditionally while walking hashtable/tree etc.
> really discouraged? I see a truckload of cond_resched() calls in similar
> walkers all over networking. I find it hard to believe that conntrack is
> somehow special and should call it only once per ms.
>
> If cond_resched() is really so expensive even just for *checking*
> (retval 0), then maybe we should only call it for every n-th hash slot?
> (every L1_CACHE_BYTES?).
>
> But even in that case it would be good to have a comment or documentation
> entry about recommended usage, or better yet, make a variant of
> xchk_maybe_relax() available via sched.h...

The plan is to remove cond_resched() and friends entirely and have
PREEMPT_LAZY fully replace PREEMPT_VOLUNTARY.

But I don't think we currently have anybody actively working on this.

Ideally distros should be switching to LAZY and report performance vs
VOLUNTARY such that we can try and address things.

Perhaps the thing to do is to just disable VOLUNTARY and see what
happens :-)
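
[ Illustration, not part of the thread: a minimal userspace sketch of the two
throttling ideas raised in the quoted text, namely checking only on every n-th
hash slot versus checking only after some time has elapsed. Everything here
is hypothetical: maybe_yield() stands in for cond_resched(), the tick counter
stands in for jiffies, and the function names and parameters are made up for
the example. It only counts how often each strategy would reach the
reschedule check; it says nothing about the real cost of cond_resched(). ]

```c
#include <stddef.h>

/* Hypothetical stand-in for kernel state: counts reschedule checks. */
struct walk_stats {
	unsigned long yield_checks;
};

/* Hypothetical stand-in for cond_resched(); here it only counts calls. */
static void maybe_yield(struct walk_stats *st)
{
	st->yield_checks++;
}

/*
 * Strategy A: reach the check once per 'stride' slots.
 * 'stride' must be nonzero.
 */
static void walk_every_nth(size_t nslots, size_t stride, struct walk_stats *st)
{
	for (size_t i = 0; i < nslots; i++) {
		/* ... process hash slot i here ... */
		if ((i + 1) % stride == 0)
			maybe_yield(st);
	}
}

/*
 * Strategy B: reach the check only when more than 'interval' ticks have
 * passed since the last one, mirroring the shape of the proposed patch.
 * The fake clock advances one tick every 'slots_per_tick' slots
 * (must be nonzero). Unsigned subtraction keeps the comparison correct
 * if the counter wraps.
 */
static void walk_time_based(size_t nslots, size_t slots_per_tick,
			    unsigned long interval, struct walk_stats *st)
{
	unsigned long last = 0;

	for (size_t i = 0; i < nslots; i++) {
		unsigned long now = (unsigned long)(i / slots_per_tick);

		/* ... process hash slot i here ... */
		if (now - last > interval) {
			maybe_yield(st);
			last = now;
		}
	}
}
```

[ Note that with a strict '>' comparison and an interval of one tick, the
check fires only after at least two ticks have elapsed, so the effective
period in this sketch is two ticks rather than one. ]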