Message-ID: <20220725121208.GB28662@redhat.com>
Date: Mon, 25 Jul 2022 14:12:09 +0200
From: Oleg Nesterov <onestero@...hat.com>
To: Tejun Heo <tj@...nel.org>
Cc: Christian Brauner <brauner@...nel.org>,
Michal Koutný <mkoutny@...e.com>,
Peter Zijlstra <peterz@...radead.org>,
John Stultz <john.stultz@...aro.org>,
Dmitry Shmidt <dimitrysh@...gle.com>,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org
Subject: Re: [PATCH RESEND 3/3 cgroup/for-5.20] cgroup: Make !percpu
threadgroup_rwsem operations optional
On 07/23, Tejun Heo wrote:
>
> +void cgroup_favor_dynmods(struct cgroup_root *root, bool favor)
> +{
> +	bool favoring = root->flags & CGRP_ROOT_FAVOR_DYNMODS;
> +
> +	/* see the comment above CGRP_ROOT_FAVOR_DYNMODS definition */
> +	if (favor && !favoring) {
> +		rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
> +		root->flags |= CGRP_ROOT_FAVOR_DYNMODS;
> +	} else if (!favor && favoring) {
> +		rcu_sync_exit(&cgroup_threadgroup_rwsem.rss);
> +		root->flags &= ~CGRP_ROOT_FAVOR_DYNMODS;
> +	}
> +}
I see no problems in this patch. But just for the record, we do not need
synchronize_rcu() in the "favor && !favoring" case, so we can probably
do something like
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -118,7 +118,7 @@ static void rcu_sync_func(struct rcu_head *rhp)
  * optimize away the grace-period wait via a state machine implemented
  * by rcu_sync_enter(), rcu_sync_exit(), and rcu_sync_func().
  */
-void rcu_sync_enter(struct rcu_sync *rsp)
+void __rcu_sync_enter(struct rcu_sync *rsp, bool wait)
 {
 	int gp_state;
 
@@ -146,13 +146,20 @@ void rcu_sync_enter(struct rcu_sync *rsp)
 		 * See the comment above, this simply does the "synchronous"
 		 * call_rcu(rcu_sync_func) which does GP_ENTER -> GP_PASSED.
 		 */
-		synchronize_rcu();
-		rcu_sync_func(&rsp->cb_head);
-		/* Not really needed, wait_event() would see GP_PASSED. */
-		return;
+		if (wait) {
+			synchronize_rcu();
+			rcu_sync_func(&rsp->cb_head);
+		} else {
+			rcu_sync_call(rsp);
+		}
+	} else if (wait) {
+		wait_event(rsp->gp_wait, READ_ONCE(rsp->gp_state) >= GP_PASSED);
 	}
+}
 
-	wait_event(rsp->gp_wait, READ_ONCE(rsp->gp_state) >= GP_PASSED);
+void rcu_sync_enter(struct rcu_sync *rsp)
+{
+	__rcu_sync_enter(rsp, true);
 }
 
 /**
later.
__rcu_sync_enter(rsp, false) works just like rcu_sync_enter_start() but it can
be safely called at any moment.
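
IOW, just to illustrate (untested, and assuming the __rcu_sync_enter()
above), the quoted cgroup_favor_dynmods() could then flip the flag
without blocking at all:

	void cgroup_favor_dynmods(struct cgroup_root *root, bool favor)
	{
		bool favoring = root->flags & CGRP_ROOT_FAVOR_DYNMODS;

		/* see the comment above CGRP_ROOT_FAVOR_DYNMODS definition */
		if (favor && !favoring) {
			/*
			 * GP_IDLE -> GP_ENTER right now; GP_ENTER -> GP_PASSED
			 * happens later from the rcu callback, nobody blocks.
			 */
			__rcu_sync_enter(&cgroup_threadgroup_rwsem.rss, false);
			root->flags |= CGRP_ROOT_FAVOR_DYNMODS;
		} else if (!favor && favoring) {
			rcu_sync_exit(&cgroup_threadgroup_rwsem.rss);
			root->flags &= ~CGRP_ROOT_FAVOR_DYNMODS;
		}
	}

Readers take the slow path as soon as gp_state != GP_IDLE, so the writer
doesn't need to wait for the grace period itself.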
And I can't resist an off-topic question... Say, cgroup_attach_task_all() does
	mutex_lock(&cgroup_mutex);
	percpu_down_write(&cgroup_threadgroup_rwsem);
and this means that synchronize_rcu() can be called with cgroup_mutex held
(percpu_down_write() does rcu_sync_enter() internally). Perhaps it makes
sense to change this code to do
	rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
	mutex_lock(&cgroup_mutex);
	percpu_down_write(&cgroup_threadgroup_rwsem);
	...
	percpu_up_write(&cgroup_threadgroup_rwsem);
	mutex_unlock(&cgroup_mutex);
	rcu_sync_exit(&cgroup_threadgroup_rwsem.rss);
? Just curious.
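
(My thinking is that the explicit rcu_sync_enter() pays the
synchronize_rcu() latency before cgroup_mutex is taken; the
rcu_sync_enter() done internally by percpu_down_write() then observes
gp_state == GP_PASSED and returns without blocking.)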
> -	/*
> -	 * The latency of the synchronize_rcu() is too high for cgroups,
> -	 * avoid it at the cost of forcing all readers into the slow path.
> -	 */
> -	rcu_sync_enter_start(&cgroup_threadgroup_rwsem.rss);
Note that rcu_sync_enter_start() has no other users, so you can probably kill it.
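
For reference, rcu_sync_enter_start() is just (modulo comments)

	void rcu_sync_enter_start(struct rcu_sync *rsp)
	{
		rsp->gp_count++;
		rsp->gp_state = GP_PASSED;
	}

it forces gp_state straight to GP_PASSED without a grace period, which is
only safe before the rcu_sync is visible to readers, while
__rcu_sync_enter(rsp, false) goes through a real call_rcu() and has no
such restriction.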
Oleg.