linux-kernel - Re: [PATCH v2] rcu: Reduce synchronize_rcu() delays when all wait heads are in use


Message-ID: <ZhUBGkcab10QM_uU@pc636>
Date: Tue, 9 Apr 2024 10:49:30 +0200
From: Uladzislau Rezki <urezki@...il.com>
To: Neeraj Upadhyay <Neeraj.Upadhyay@....com>,
	Frederic Weisbecker <frederic@...nel.org>
Cc: Frederic Weisbecker <frederic@...nel.org>, paulmck@...nel.org,
	joel@...lfernandes.org, urezki@...il.com, josh@...htriplett.org,
	boqun.feng@...il.com, rostedt@...dmis.org,
	mathieu.desnoyers@...icios.com, jiangshanlai@...il.com,
	qiang.zhang1211@...il.com, rcu@...r.kernel.org,
	linux-kernel@...r.kernel.org, neeraj.upadhyay@...nel.org
Subject: Re: [PATCH v2] rcu: Reduce synchronize_rcu() delays when all wait
 heads are in use

Hello, Neeraj, Frederic!

> 
> On 4/5/2024 3:12 AM, Frederic Weisbecker wrote:
> > Le Wed, Apr 03, 2024 at 04:22:12PM +0530, Neeraj Upadhyay a écrit :
> >> When all wait heads are in use, which can happen when
> >> rcu_sr_normal_gp_cleanup_work()'s callback processing
> >> is slow, any new synchronize_rcu() user's rcu_synchronize
> >> node's processing is deferred to future GP periods. This
> >> can result in long list of synchronize_rcu() invocations
> >> waiting for full grace period processing, which can delay
> >> freeing of memory. Mitigate this problem by using first
> >> node in the list as wait tail when all wait heads are in use.
> >> While methods to speed up callback processing would be needed
> >> to recover from this situation, allowing new nodes to complete
> >> their grace period can help prevent delays due to a fixed
> >> number of wait head nodes.
> >>
> >> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@....com>
> > 
> > Looking at it again, I'm not sure if it's a good idea to
> > optimize the thing that far. It's already a tricky state machine
> > to review and the workqueue has SR_NORMAL_GP_WAIT_HEAD_MAX - 1 = 4
> > grace periods worth of time to execute. Such a tense situation may
> > happen of course but, should we really work around that?
> > 
> > I let you guys judge. In the meantime, I haven't found correctness
> 
> I agree that this adds more complexity for handling a scenario
> which is not expected to happen often. Also, this does not help
> much to recover from the situation, as most of the callbacks are still
> blocked on kworker execution. Intent was to keep the patch ready, in
> case we see fixed SR_NORMAL_GP_WAIT_HEAD_MAX as a blocking factor.
> It's fine from my side if we want to hold off this one. Uladzislau
> what do you think?
> 
I agree with Frederic, and we discussed this patch with Neeraj! I think
the state machine is already a bit complex. Let's hold off on it for now.
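
For anyone reading this thread later, below is a minimal userspace sketch
of the situation under discussion: a fixed pool of wait heads that can run
out when cleanup is slow. It is only an illustration, not the code in
kernel/rcu/tree.c. The names wait_head, get_wait_head() and put_wait_head()
are hypothetical, the pool size of 5 is inferred from the
"SR_NORMAL_GP_WAIT_HEAD_MAX - 1 = 4" remark quoted above, and C11 atomics
stand in for the kernel primitives. The fallback proposed in the patch
(using the first node in the list as wait tail) is not modelled here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed pool size, inferred from "SR_NORMAL_GP_WAIT_HEAD_MAX - 1 = 4". */
#define WAIT_HEAD_MAX 5

/* Hypothetical stand-in for a wait head: just an "in use" flag. */
struct wait_head {
	atomic_int inuse;
};

/* Static storage, so every inuse flag starts at 0 (free). */
static struct wait_head wait_heads[WAIT_HEAD_MAX];

/*
 * Try to claim a free wait head. Returns NULL when all of them are
 * in use, which is the case where new synchronize_rcu() users end up
 * deferred to a later grace period.
 */
static struct wait_head *get_wait_head(void)
{
	for (size_t i = 0; i < WAIT_HEAD_MAX; i++) {
		int expected = 0;

		/* Claim slot i only if it is currently free (0 -> 1). */
		if (atomic_compare_exchange_strong(&wait_heads[i].inuse,
						   &expected, 1))
			return &wait_heads[i];
	}
	return NULL;
}

/* Release a wait head once its cleanup work has finished. */
static void put_wait_head(struct wait_head *wh)
{
	atomic_store(&wh->inuse, 0);
}
```

In this model, a caller that gets NULL has to wait until the cleanup side
calls put_wait_head(). That wait is the delay the patch tries to shorten.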

--
Uladzislau Rezki