Message-ID: <20250225163549.GB29585@redhat.com>
Date: Tue, 25 Feb 2025 17:35:50 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
linux-kernel@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Waiman Long <longman@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] task_work: Consume only one item at a time while invoking
the callbacks.
On 02/25, Frederic Weisbecker wrote:
>
> On Sun, Feb 23, 2025 at 11:40:15PM +0100, Oleg Nesterov wrote:
> >
> > I'll try to find and read the previous discussions tomorrow, but iirc Frederic
> > had another solution?
>
> This:
>
> https://lore.kernel.org/all/Zx-B0wK3xqRQsCOS@localhost.localdomain/
>
> Though I'm not entirely happy with it either.
Yes, thanks...
Can we do something else and avoid this rcuwait_wait_event() altogether?
To simplify the discussion, suppose we add a global XXX_LOCK. Just in
case, of course we shouldn't do this ;) But let's suppose we do.
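
(Purely for illustration, and not something the patch below adds: the
hypothetical XXX_LOCK could be read as something like the following;
a real version would of course want a finer-grained lock.)

	static DEFINE_SPINLOCK(xxx_lock);	/* discussion only */
	#define XXX_LOCK	(&xxx_lock)	/* so the hunks below can say spin_lock(XXX_LOCK) */
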
Now, can _something_ like the (incomplete, ugly as hell) patch below work?
Oleg.
---
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5304,12 +5304,12 @@ static void perf_pending_task_sync(struct perf_event *event)
return;
}
- /*
- * All accesses related to the event are within the same RCU section in
- * perf_pending_task(). The RCU grace period before the event is freed
- * will make sure all those accesses are complete by then.
- */
- rcuwait_wait_event(&event->pending_work_wait, !event->pending_work, TASK_UNINTERRUPTIBLE);
+ spin_lock(XXX_LOCK);
+ if (event->pending_work) {
+ local_dec(&event->ctx->nr_no_switch_fast);
+ event->pending_work = -1;
+ }
+ spin_unlock(XXX_LOCK);
}
static void _free_event(struct perf_event *event)
@@ -5369,7 +5369,15 @@ static void _free_event(struct perf_event *event)
exclusive_event_destroy(event);
module_put(event->pmu->module);
- call_rcu(&event->rcu_head, free_event_rcu);
+ bool free = true;
+ spin_lock(XXX_LOCK);
+ if (event->pending_work == -1) {
+ event->pending_work = -2;
+ free = false;
+ }
+ spin_unlock(XXX_LOCK);
+ if (free)
+ call_rcu(&event->rcu_head, free_event_rcu);
}
/*
@@ -6981,7 +6989,14 @@ static void perf_pending_task(struct callback_head *head)
{
struct perf_event *event = container_of(head, struct perf_event, pending_task);
int rctx;
+ bool free = false;
+ spin_lock(XXX_LOCK);
+ if ((int)event->pending_work < 0) {
+ free = event->pending_work == -2u;
+ event->pending_work = 0;
+ goto unlock;
+ }
/*
* All accesses to the event must belong to the same implicit RCU read-side
* critical section as the ->pending_work reset. See comment in
@@ -7004,6 +7019,12 @@ static void perf_pending_task(struct callback_head *head)
if (rctx >= 0)
perf_swevent_put_recursion_context(rctx);
+
+unlock:
+ spin_unlock(XXX_LOCK);
+
+ if (free)
+ call_rcu(&event->rcu_head, free_event_rcu);
}
#ifdef CONFIG_GUEST_PERF_EVENTS
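
(For what it's worth, my reading of the special pending_work values used
above, as a sketch only; the names below are invented for this discussion
and are not introduced by the patch:)

	/*
	 * event->pending_work in the hunks above:
	 *
	 *   > 0  task_work is queued; perf_pending_task() runs normally.
	 *   -1   perf_pending_task_sync() could not wait: the callback must
	 *        skip the event, but _free_event() has not run yet.
	 *   -2   _free_event() has also run; the callback is now responsible
	 *        for the final call_rcu(&event->rcu_head, free_event_rcu).
	 */
	enum {
		PENDING_WORK_ORPHANED	= -1,
		PENDING_WORK_FREED	= -2,
	};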