[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191014135832.GO2359@hirez.programming.kicks-ass.net>
Date: Mon, 14 Oct 2019 15:58:32 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Manfred Spraul <manfred@...orfullife.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Davidlohr Bueso <dave@...olabs.net>,
Waiman Long <longman@...hat.com>, 1vier1@....de,
Andrew Morton <akpm@...ux-foundation.org>,
Jonathan Corbet <corbet@....net>
Subject: Re: [PATCH 3/6] ipc/mqueue.c: Update/document memory barriers
On Mon, Oct 14, 2019 at 02:59:11PM +0200, Peter Zijlstra wrote:
> On Sat, Oct 12, 2019 at 07:49:55AM +0200, Manfred Spraul wrote:
>
> > for (;;) {
> > + /* memory barrier not required, we hold info->lock */
> > __set_current_state(TASK_INTERRUPTIBLE);
> >
> > spin_unlock(&info->lock);
> > time = schedule_hrtimeout_range_clock(timeout, 0,
> > HRTIMER_MODE_ABS, CLOCK_REALTIME);
> >
> > + if (READ_ONCE(ewp->state) == STATE_READY) {
> > + /*
> > + * Pairs, together with READ_ONCE(), with
> > + * the barrier in __pipelined_op().
> > + */
> > + smp_acquire__after_ctrl_dep();
> > retval = 0;
> > goto out;
> > }
> > spin_lock(&info->lock);
> > +
> > + /* we hold info->lock, so no memory barrier required */
> > + if (READ_ONCE(ewp->state) == STATE_READY) {
> > retval = 0;
> > goto out_unlock;
> > }
> > @@ -925,14 +933,12 @@ static inline void __pipelined_op(struct wake_q_head *wake_q,
> > list_del(&this->list);
> > wake_q_add(wake_q, this->task);
> > /*
> > + * The barrier is required to ensure that the refcount increase
> > + * inside wake_q_add() is completed before the state is updated.
>
> fails to explain *why* this is important.
>
> > + *
> > + * The barrier pairs with READ_ONCE()+smp_mb__after_ctrl_dep().
> > */
> > + smp_store_release(&this->state, STATE_READY);
>
> You retained the whitespace damage.
>
> And I'm terribly confused by this code, probably due to the lack of
> 'why' as per the above. What is this trying to do?
>
> Are we worried about something like:
>
> A B C
>
>
> wq_sleep()
> schedule_...();
>
> /* spuriuos wakeup */
> wake_up_process(B)
>
> wake_q_add(A)
> if (cmpxchg()) // success
>
> ->state = STATE_READY (reordered)
>
> if (READ_ONCE() == STATE_READY)
> goto out;
>
> exit();
>
>
> get_task_struct() // UaF
>
>
> Can we put the exact and full race in the comment please?
Like Davidlohr already suggested, elsewhere we write it like so:
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -930,15 +930,10 @@ static inline void __pipelined_op(struct
struct mqueue_inode_info *info,
struct ext_wait_queue *this)
{
+ get_task_struct(this->task);
list_del(&this->list);
- wake_q_add(wake_q, this->task);
- /*
- * The barrier is required to ensure that the refcount increase
- * inside wake_q_add() is completed before the state is updated.
- *
- * The barrier pairs with READ_ONCE()+smp_mb__after_ctrl_dep().
- */
- smp_store_release(&this->state, STATE_READY);
+ smp_store_release(&this->state, STATE_READY);
+ wake_q_add_safe(wake_q, this->task);
}
/* pipelined_send() - send a message directly to the task waiting in
Powered by blists - more mailing lists