Message-ID: <1296508677.26581.84.camel@laptop>
Date: Mon, 31 Jan 2011 22:17:57 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Milton Miller <miltonm@....com>
Cc: akpm@...ux-foundation.org, Anton Blanchard <anton@...ba.org>,
xiaoguangrong@...fujitsu.com, mingo@...e.hu, jaxboe@...ionio.com,
npiggin@...il.com, rusty@...tcorp.com.au,
torvalds@...ux-foundation.org, paulmck@...ux.vnet.ibm.com,
benh@...nel.crashing.org, linux-kernel@...r.kernel.org
Subject: Re: call_function_many: fix list delete vs add race
On Fri, 2011-01-28 at 18:20 -0600, Milton Miller wrote:
> @@ -491,15 +491,17 @@ void smp_call_function_many(const struct
> cpumask_clear_cpu(this_cpu, data->cpumask);
>
> /*
> -	 * To ensure the interrupt handler gets a complete view
> - * we order the cpumask and refs writes and order the read
> - * of them in the interrupt handler. In addition we may
> - * only clear our own cpu bit from the mask.
> + * We reuse the call function data without waiting for any grace
> + * period after some other cpu removes it from the global queue.
> +	 * This means a cpu might find our data block as it is written.
> + * The interrupt handler waits until it sees refs filled out
> + * while its cpu mask bit is set; here we may only clear our
> + * own cpu mask bit, and must wait to set refs until we are sure
> + * previous writes are complete and we have obtained the lock to
> + * add the element to the queue. We use the acquire and release
> + * of the lock as a wmb() -- acquire prevents write moving up and
> + * release requires old writes are visible.
That's wrong:
->foo =
LOCK
UNLOCK
->bar =
can be re-ordered as:
LOCK
->bar =
->foo =
UNLOCK
Only an UNLOCK+LOCK sequence can be considered a full MB.
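Concretely (an illustrative mapping onto this hunk, not the literal code):

	cpumask_clear_cpu(this_cpu, data->cpumask);	/* ->foo = */

	raw_spin_lock_irqsave(&call_function.lock, flags);	/* LOCK */
	list_add_rcu(&data->csd.list, &call_function.queue);
	raw_spin_unlock_irqrestore(&call_function.lock, flags);	/* UNLOCK */

	atomic_set(&data->refs, cpumask_weight(data->cpumask));	/* ->bar = */

Both stores may drift into the critical section and pass each other
there; the lock pair alone is not a wmb() between them.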
However, I think the code is still OK because list_add_rcu() implies a
wmb(); in that respect it's an improvement, since we fix a race and
avoid an extra wmb(). But the comment needs an update.
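For reference, the wmb() comes from rcu_assign_pointer(); from memory,
that era's __list_add_rcu() looks roughly like:

	static inline void __list_add_rcu(struct list_head *new,
			struct list_head *prev, struct list_head *next)
	{
		new->next = next;
		new->prev = prev;
		rcu_assign_pointer(prev->next, new);	/* smp_wmb() before the publish */
		next->prev = new;
	}

That smp_wmb() orders every store before list_add_rcu() (the cpumask
setup) against every store after it (the refs write), which is exactly
the ordering this patch needs.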
> */
> - smp_wmb();
> -
> - atomic_set(&data->refs, cpumask_weight(data->cpumask));
> -
> raw_spin_lock_irqsave(&call_function.lock, flags);
> /*
> * Place entry at the _HEAD_ of the list, so that any cpu still
> @@ -509,6 +511,8 @@ void smp_call_function_many(const struct
> list_add_rcu(&data->csd.list, &call_function.queue);
> raw_spin_unlock_irqrestore(&call_function.lock, flags);
And this wants to grow a comment noting that it relies on the wmb()
implied by list_add_rcu(); something like the sketch below.
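Wording up for grabs, but say:

	/*
	 * list_add_rcu() above implies a wmb() (via rcu_assign_pointer()),
	 * ordering the cpumask stores before the refs store below;
	 * no additional smp_wmb() is needed.
	 */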
> + atomic_set(&data->refs, cpumask_weight(data->cpumask));
> +
> /*
> * Make the list addition visible before sending the ipi.
> * (IPIs must obey or appear to obey normal Linux cache
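For completeness, the read side this pairs with in the interrupt
handler looks roughly like this (paraphrased from memory, not the
exact hunk):

	list_for_each_entry_rcu(data, &call_function.queue, csd.list) {

		if (!cpumask_test_cpu(cpu, data->cpumask))
			continue;

		smp_rmb();	/* pairs with the wmb() in list_add_rcu() */

		/* refs not yet set: the entry is still being (re)written, skip */
		if (!atomic_read(&data->refs))
			continue;

		...
	}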