Date:   Wed, 26 Apr 2023 14:17:03 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Tejun Heo <tj@...nel.org>
Cc:     rcu@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel-team@...a.com, rostedt@...dmis.org, riel@...riel.com
Subject: Re: [PATCH RFC rcu] Stop rcu_tasks_invoke_cbs() from using
 never-online CPUs

On Wed, Apr 26, 2023 at 09:55:55AM -1000, Tejun Heo wrote:
> Hello, Paul.
> 
> On Wed, Apr 26, 2023 at 10:26:38AM -0700, Paul E. McKenney wrote:
> > The rcu_tasks_invoke_cbs() relies on queue_work_on() to silently fall
> > back to WORK_CPU_UNBOUND when the specified CPU is offline.  However,
> > the queue_work_on() function's silent fallback mechanism relies on that
> > CPU having been online at some time in the past.  When queue_work_on()
> > is passed a CPU that has never been online, workqueue lockups ensue,
> > which can be bad for your kernel's general health and well-being.
> > 
> > This commit therefore checks whether a given CPU is currently online,
> > and, if not, substitutes WORK_CPU_UNBOUND in the subsequent call to
> > queue_work_on().  Why not simply omit the queue_work_on() call entirely?
> > Because this function is flooding callback-invocation notifications
> > to all CPUs, and must deal with possibilities that include a sparse
> > cpu_possible_mask.
> > 
> > Fixes: d363f833c6d88 rcu-tasks: Use workqueues for multiple rcu_tasks_invoke_cbs() invocations
> > Reported-by: Tejun Heo <tj@...nel.org>
> > Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> 
> I don't understand the code at all but wonder whether it can do sth similar
> to cpumask_any_distribute() which RR's through the specified cpumask. Would
> it make sense to change rcu_tasks_invoke_cbs() to do something similar
> against cpu_online_mask?

It might.

But the idea here is to spread the load of queueing the work as well as
spreading the load of invoking the callbacks.

I suppose that I could allocate an array of ints, gather the online CPUs
into that array, and do a power-of-two distribution across that array.
But RCU Tasks allows CPUs to go offline with queued callbacks, so this
array would also need to include those CPUs as well as the ones that
are online.
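
To make that alternative concrete, here is a rough user-space sketch, not
kernel code.  The struct cpu_state type and both helper names are
hypothetical stand-ins for cpu_online() and for a check of a CPU's queued
callbacks.  It also assumes that the power-of-two distribution is a
binary-tree fan-out in which slot i of the array notifies slots 2i+1
and 2i+2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state, standing in for cpu_online() and a
 * "does this CPU still have queued callbacks?" check. */
struct cpu_state {
	bool online;
	bool has_queued_cbs;
};

/* Gather the IDs of all CPUs that are online or that still hold queued
 * callbacks into targets[], which must have room for ncpus entries.
 * Returns the number of IDs stored. */
static size_t gather_target_cpus(const struct cpu_state *cpus, size_t ncpus,
				 int *targets)
{
	size_t n = 0;

	assert(cpus != NULL && targets != NULL);
	for (size_t cpu = 0; cpu < ncpus; cpu++)
		if (cpus[cpu].online || cpus[cpu].has_queued_cbs)
			targets[n++] = (int)cpu;
	return n;
}

/* Binary-tree fan-out over the gathered array: slot i hands off to
 * slots 2i+1 and 2i+2 when they exist.  Stores up to two CPU IDs in
 * out[] and returns how many were stored. */
static size_t fanout_children(const int *targets, size_t ntargets,
			      size_t slot, int out[2])
{
	size_t n = 0;

	for (size_t child = 2 * slot + 1;
	     child <= 2 * slot + 2 && child < ntargets; child++)
		out[n++] = targets[child];
	return n;
}
```

Because the array is dense even when the CPU numbering is not, every slot
names a CPU that is worth notifying.  The price is that the array must be
allocated and then kept up to date as CPUs come and go.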

Given that the common-case system has a dense cpu_online_mask, I opted
to keep it simple, which is optimal in the common case.
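
For comparison, the simple approach boils down to something like the
following user-space sketch.  It is not the patch itself: the online[]
array stands in for cpu_online(), and SKETCH_CPU_UNBOUND and
pick_work_cpu() are hypothetical names, the former standing in for
WORK_CPU_UNBOUND.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's WORK_CPU_UNBOUND. */
#define SKETCH_CPU_UNBOUND (-1)

/* Return the CPU on which to queue the work: the requested CPU if it is
 * currently online, otherwise the unbound placeholder.  The work is
 * queued either way, so the notification is never dropped. */
static int pick_work_cpu(const bool *online, int ncpus, int cpu)
{
	assert(cpu >= 0 && cpu < ncpus);
	return online[cpu] ? cpu : SKETCH_CPU_UNBOUND;
}
```

With a dense online mask the check almost always picks the requested CPU,
so the common case pays for one test and nothing else.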

Or am I missing a trick here?

							Thanx, Paul