Message-ID: <20160627122335.GB30154@twins.programming.kicks-ass.net>
Date: Mon, 27 Jun 2016 14:23:35 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Juri Lelli <juri.lelli@....com>
Cc: mingo@...nel.org, tglx@...utronix.de, rostedt@...dmis.org,
xlpang@...hat.com, linux-kernel@...r.kernel.org,
mathieu.desnoyers@...icios.com, jdesfossez@...icios.com,
bristot@...hat.com
Subject: Re: [RFC][PATCH 8/8] rtmutex: Fix PI chain order integrity

On Wed, Jun 15, 2016 at 08:25:07AM +0100, Juri Lelli wrote:
> I guess it's not that likely, but yes, it could potentially happen that a
> waiter is optimistically spinning, depletes its runtime, gets throttled
> and is then replenished while still spinning. Maybe it doesn't really
> make sense to continue spinning in this situation, but I guess things
> get really complicated. :-/
>
> Anyway, as said, I think this patch is OK. Maybe we want to add a
> comment just to record what situation can cause an issue if we don't
> do this? The patch changelog would be fine as such a comment too, IMHO.
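
(Aside: the scenario above is easy to set up from userspace: make the task
SCHED_DEADLINE and let it overrun its budget while spinning. A minimal,
hypothetical sketch -- not part of this patch, and assuming libc headers
that provide SYS_sched_setattr, since glibc has no wrapper for it:)

/* dl_task.c -- hypothetical example, not part of the patch.
 * Makes the calling task SCHED_DEADLINE, after which CBS runtime
 * depletion throttles it and the replenishment timer advances
 * dl.deadline -- exactly what leaves waiter->deadline stale.
 */
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE	6
#endif

struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;		/* SCHED_OTHER, SCHED_BATCH */
	uint32_t sched_priority;	/* SCHED_FIFO, SCHED_RR */
	uint64_t sched_runtime;		/* SCHED_DEADLINE, all in ns */
	uint64_t sched_deadline;
	uint64_t sched_period;
};

int main(void)
{
	struct sched_attr attr = {
		.size		= sizeof(attr),
		.sched_policy	= SCHED_DEADLINE,
		.sched_runtime	=  10 * 1000 * 1000,	/*  10ms budget   */
		.sched_deadline	=  30 * 1000 * 1000,	/*  30ms deadline */
		.sched_period	= 100 * 1000 * 1000,	/* 100ms period   */
	};

	if (syscall(SYS_sched_setattr, 0 /* self */, &attr, 0 /* flags */)) {
		perror("sched_setattr");
		return 1;
	}
	/* ... now contend on a PI lock and burn more than sched_runtime ... */
	return 0;
}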

OK, so I went to write a simple comment and ended up with the below :/

While writing the comment I noticed two issues:

 - we update the waiter order fields while the entry is still enqueued
   on the pi_waiters tree, which is also sorted by these exact fields
   (see the sketch below).

 - another one of these pure ->prio comparisons, which is blind to the
   deadline ordering between DL waiters (see the sketch after the patch).

Please double check, there be dragons here.
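
To make the first issue concrete: rbtree insertions compare against the
keys of nodes already in the tree, so changing ->prio / ->deadline while
the waiter is enqueued silently breaks the ordering invariant; erase, on
the other hand, only needs the node's links. Hence the dequeue / update /
re-enqueue dance at [7] below. A user-space sketch of that discipline
(a sorted list standing in for the rbtree; all names made up):

#include <stdio.h>

/* Stand-in for rt_mutex_waiter: container kept sorted on ->prio. */
struct waiter {
	int prio;		/* lower value == higher priority */
	struct waiter *next;
};

/* Insert in ascending ->prio order; this walk compares keys. */
static void enqueue(struct waiter **head, struct waiter *w)
{
	while (*head && (*head)->prio <= w->prio)
		head = &(*head)->next;
	w->next = *head;
	*head = w;
}

/* Unlink by identity; like rb_erase(), does not look at the keys. */
static void dequeue(struct waiter **head, struct waiter *w)
{
	while (*head != w)
		head = &(*head)->next;
	*head = w->next;
}

int main(void)
{
	struct waiter a = { .prio = 10 }, b = { .prio = 20 }, c = { .prio = 30 };
	struct waiter *q = NULL;

	enqueue(&q, &a);
	enqueue(&q, &b);
	enqueue(&q, &c);

	/* Wrong: "b.prio = 40;" in place would leave q mis-ordered.  */
	/* Right: mirror the patch -- dequeue, update, re-enqueue.    */
	dequeue(&q, &b);
	b.prio = 40;
	enqueue(&q, &b);

	for (struct waiter *w = q; w; w = w->next)
		printf("%d\n", w->prio);	/* 10 30 40 */
	return 0;
}
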
---
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -158,6 +158,9 @@ static inline bool unlock_rt_mutex_safe(
 }
 #endif
 
+#define cmp_task(p) \
+	&(struct rt_mutex_waiter){ .prio = (p)->prio, .deadline = (p)->dl.deadline }
+
 static inline int
 rt_mutex_waiter_less(struct rt_mutex_waiter *left,
 		     struct rt_mutex_waiter *right)
@@ -583,8 +586,26 @@ static int rt_mutex_adjust_prio_chain(st
 
 	/* [7] Requeue the waiter in the lock waiter tree. */
 	rt_mutex_dequeue(lock, waiter);
+
+	/*
+	 * Update the waiter prio fields now that we're dequeued.
+	 *
+	 * These values can have changed through either:
+	 *
+	 *   sys_sched_set_scheduler() / sys_sched_setattr()
+	 *
+	 * or
+	 *
+	 *   DL CBS enforcement advancing the effective deadline.
+	 *
+	 * Even though pi_waiters also uses these fields, and that tree is only
+	 * updated in [11], we can do this here, since we hold [L], which
+	 * serializes all pi_waiters access and rb_erase() does not care about
+	 * the values of the node being removed.
+	 */
 	waiter->prio = task->prio;
 	waiter->deadline = task->dl.deadline;
+
 	rt_mutex_enqueue(lock, waiter);
 
 	/* [8] Release the task */
@@ -711,6 +732,8 @@ static int rt_mutex_adjust_prio_chain(st
 static int try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task,
 				struct rt_mutex_waiter *waiter)
 {
+	lockdep_assert_held(&lock->wait_lock);
+
 	/*
 	 * Before testing whether we can acquire @lock, we set the
 	 * RT_MUTEX_HAS_WAITERS bit in @lock->owner. This forces all
@@ -770,7 +793,8 @@ static int try_to_take_rt_mutex(struct r
 		 * the top waiter priority (kernel view),
 		 * @task lost.
 		 */
-		if (task->prio >= rt_mutex_top_waiter(lock)->prio)
+		if (!rt_mutex_waiter_less(cmp_task(task),
+					  rt_mutex_top_waiter(lock)))
 			return 0;
 
 		/*
@@ -838,6 +862,8 @@ static int task_blocks_on_rt_mutex(struc
 	struct rt_mutex *next_lock;
 	int chain_walk = 0, res;
 
+	lockdep_assert_held(&lock->wait_lock);
+
 	/*
 	 * Early deadlock detection. We really don't want the task to
 	 * enqueue on itself just to untangle the mess later. It's not
@@ -973,6 +999,8 @@ static void remove_waiter(struct rt_mute
 	struct task_struct *owner = rt_mutex_owner(lock);
 	struct rt_mutex *next_lock;
 
+	lockdep_assert_held(&lock->wait_lock);
+
 	raw_spin_lock(&current->pi_lock);
 	rt_mutex_dequeue(lock, waiter);
 	current->pi_blocked_on = NULL;
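
For the second issue: cmp_task() above builds a throwaway rt_mutex_waiter
on the stack via a C99 compound literal, so @task can be fed to the same
comparator that ordered the waiter tree, instead of a raw ->prio test that
ignores DL deadlines. A stand-alone user-space approximation (comparator
simplified from the kernel's rt_mutex_waiter_less(); names and numbers
mine):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_DL_PRIO	0	/* DL tasks sit below prio 0, kernel-style */

struct rt_mutex_waiter {
	int	 prio;		/* lower value == higher priority */
	uint64_t deadline;	/* only meaningful for DL waiters  */
};

static bool dl_prio(int prio)
{
	return prio < MAX_DL_PRIO;
}

/* Mirrors rt_mutex_waiter_less(): prio first, deadline as DL tie-break. */
static bool waiter_less(struct rt_mutex_waiter *left,
			struct rt_mutex_waiter *right)
{
	if (left->prio < right->prio)
		return true;

	/* Same prio and left is DL: right must be DL too, compare deadlines. */
	if (dl_prio(left->prio) && left->prio == right->prio)
		return left->deadline < right->deadline;

	return false;
}

/* The compound-literal trick: a throwaway waiter built from task fields. */
#define cmp_task(prio_, dl_) \
	&(struct rt_mutex_waiter){ .prio = (prio_), .deadline = (dl_) }

int main(void)
{
	struct rt_mutex_waiter top = { .prio = -1, .deadline = 100 };

	/* Equal ->prio (all DL tasks have -1) but an earlier deadline:
	 * the raw ->prio >= test declares the task the loser, while the
	 * deadline-aware comparison correctly says it should win. */
	printf("pure prio: %s\n", -1 >= top.prio ? "lost" : "won");
	printf("cmp_task : %s\n",
	       waiter_less(cmp_task(-1, 50), &top) ? "won" : "lost");
	return 0;
}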