Message-ID: <Pine.LNX.4.64.1210171046130.26481@file.rdu.redhat.com>
Date: Wed, 17 Oct 2012 11:07:21 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: Lai Jiangshan <laijs@...fujitsu.com>
cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Jens Axboe <axboe@...nel.dk>, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: [PATCH] percpu-rwsem: use barrier in unlock path
Hi
On Wed, 17 Oct 2012, Lai Jiangshan wrote:
> On 10/17/2012 10:23 AM, Linus Torvalds wrote:
> > [ Architecture people, note the potential new SMP barrier! ]
> >
> > On Tue, Oct 16, 2012 at 4:30 PM, Mikulas Patocka <mpatocka@...hat.com> wrote:
> >> + /*
> >> + * The lock is considered unlocked when p->locked is set to false.
> >> + * Use barrier prevent reordering of operations around p->locked.
> >> + */
> >> +#if defined(CONFIG_X86) && (!defined(CONFIG_X86_PPRO_FENCE) && !defined(CONFIG_X86_OOSTORE))
> >> + barrier();
> >> +#else
> >> + smp_mb();
> >> +#endif
> >> p->locked = false;
> >
> > Ugh. The #if is too ugly to live.
>
> Even the previous patch is applied, percpu_down_read() still
> needs mb() to pair with it.
percpu_down_read uses rcu_read_lock, which should guarantee that memory
accesses don't escape in front of an RCU-protected section.

If rcu_read_unlock has only an unlock barrier and not a full barrier,
memory accesses could be moved in front of rcu_read_unlock and reordered
with this_cpu_inc(*p->counters), but it doesn't matter because
percpu_down_write does synchronize_rcu(), so it never sees these accesses
halfway through.
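
For illustration only, the reader/writer handshake described above can be
modelled in userspace C11. This is a sketch under stated assumptions, not
the kernel code: all model_* names and NSLOTS are hypothetical, the per-CPU
counters become a small array of atomics, and sequentially consistent
atomics stand in for rcu_read_lock()/synchronize_rcu(). Because there is no
RCU here, the reader increments its counter first and checks the flag
afterwards, which is a different technique from the patch under discussion.
Writers are assumed to be serialized by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NSLOTS 4 /* hypothetical stand-in for the number of CPUs */

struct model_percpu_rwsem {
	atomic_int counters[NSLOTS]; /* per-slot reader counts */
	atomic_bool locked;          /* true while a writer holds the lock */
};

static void model_init(struct model_percpu_rwsem *p)
{
	for (int i = 0; i < NSLOTS; i++)
		atomic_init(&p->counters[i], 0);
	atomic_init(&p->locked, false);
}

/*
 * Reader entry. Announce the reader first, then check for a writer.
 * Both operations are seq_cst, so either the writer's sum sees this
 * increment or this load sees locked == true.
 */
static bool model_down_read_trylock(struct model_percpu_rwsem *p, int slot)
{
	assert(slot >= 0 && slot < NSLOTS);
	atomic_fetch_add(&p->counters[slot], 1);
	if (atomic_load(&p->locked)) {
		atomic_fetch_sub(&p->counters[slot], 1); /* back off */
		return false;
	}
	return true;
}

/* Reader exit: the seq_cst decrement orders the reader's accesses before it. */
static void model_up_read(struct model_percpu_rwsem *p, int slot)
{
	assert(slot >= 0 && slot < NSLOTS);
	atomic_fetch_sub(&p->counters[slot], 1);
}

/* Sum of the per-slot counters. */
static int model_readers_active(struct model_percpu_rwsem *p)
{
	int sum = 0;

	for (int i = 0; i < NSLOTS; i++)
		sum += atomic_load(&p->counters[i]);
	return sum;
}

/* Writer entry, step 1: forbid new readers. */
static void model_down_write_begin(struct model_percpu_rwsem *p)
{
	atomic_store(&p->locked, true);
}

/* Writer entry, step 2: true once every earlier reader has left. */
static bool model_down_write_ready(struct model_percpu_rwsem *p)
{
	return model_readers_active(p) == 0;
}

/* Writer exit: the seq_cst store orders the writer's accesses before it. */
static void model_up_write(struct model_percpu_rwsem *p)
{
	atomic_store(&p->locked, false);
}
```

In this model a writer calls model_down_write_begin() and then polls
model_down_write_ready() until the sum of the counters drops to zero; a
reader that gets false from model_down_read_trylock() has to retry or fall
back to a slow path.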
> > This is a classic case of "people who write their own serialization
> > primitives invariably get them wrong". And this fix is just horrible,
> > and code like this should not be allowed.
>
> One of the most major problems of 62ac665ff9fc07497ca524bd20d6a96893d11071 is that
> it is merged without Ackeds or Revieweds from Paul or Peter or someone else
> who are expert at synchronization/arch memory models.
>
> I suggest any new synchronization should stay in -tip for 2 or more cycles
> before merged to mainline.
But the bug that this synchronization is fixing is quite serious: it
causes random crashes when the block size is being changed, and the crash
happens regularly at multiple important business sites. So it must be
fixed soon and cannot wait half a year.
> Thanks,
> Lai
Mikulas