Message-ID: <20150318152130.GA19814@leverpostej>
Date: Wed, 18 Mar 2015 15:21:30 +0000
From: Mark Rutland <mark.rutland@....com>
To: Joonsoo Kim <js1304@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Catalin Marinas <Catalin.Marinas@....com>,
Christoph Lameter <cl@...ux.com>,
David Rientjes <rientjes@...gle.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Pekka Enberg <penberg@...nel.org>,
Steve Capper <steve.capper@...aro.org>
Subject: Re: [PATCHv2] mm/slub: fix lockups on PREEMPT && !SMP kernels
Hi,
> > do {
> > tid = this_cpu_read(s->cpu_slab->tid);
> > c = raw_cpu_ptr(s->cpu_slab);
> > - } while (IS_ENABLED(CONFIG_PREEMPT) && unlikely(tid != c->tid));
> > + } while (IS_ENABLED(CONFIG_PREEMPT) &&
> > + unlikely(tid != READ_ONCE(c->tid)));
[...]
> Could you show me generated code again?
The code generated without this patch in !SMP && PREEMPT kernels is:
/* Hoisted load of c->tid */
ffffffc00016d3c4: f9400404 ldr x4, [x0,#8]
/* this_cpu_read(s->cpu_slab->tid) -- buggy, see [1] */
ffffffc00016d3c8: f9400401 ldr x1, [x0,#8]
ffffffc00016d3cc: eb04003f cmp x1, x4
ffffffc00016d3d0: 54ffffc1 b.ne ffffffc00016d3c8 <slab_alloc_node.constprop.82+0x30>
The code generated with this patch in !SMP && PREEMPT kernels is:
/* this_cpu_read(s->cpu_slab->tid) -- buggy, see [1] */
ffffffc00016d3c4: f9400401 ldr x1, [x0,#8]
/* load of c->tid */
ffffffc00016d3c8: f9400404 ldr x4, [x0,#8]
ffffffc00016d3cc: eb04003f cmp x1, x4
ffffffc00016d3d0: 54ffffa1 b.ne ffffffc00016d3c4 <slab_alloc_node.constprop.82+0x2c>
Note that with the patch the branch results in both loads being
performed again.
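To make the difference concrete, here is a minimal userspace sketch of the retry pattern. READ_ONCE_SKETCH, struct cpu_slab_sketch and stable_tid() are names made up for illustration; they are not the kernel's definitions, and the macro relies on the GNU __typeof__ extension. The volatile access is what stops the compiler from hoisting the second load out of the loop:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace stand-in for the kernel's READ_ONCE() (an assumption of this
 * sketch): the volatile cast forces a fresh load on every evaluation.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, cut-down stand-in for struct kmem_cache_cpu. */
struct cpu_slab_sketch {
	void *freelist;
	unsigned long tid;
};

/*
 * Snapshot tid and retry until a second read agrees with it. Both reads
 * go through READ_ONCE_SKETCH(), so neither can be hoisted above the
 * loop even though 'c' is loop-invariant.
 */
static unsigned long stable_tid(struct cpu_slab_sketch *c)
{
	unsigned long tid;

	assert(c != NULL);
	do {
		tid = READ_ONCE_SKETCH(c->tid);
	} while (tid != READ_ONCE_SKETCH(c->tid));

	return tid;
}
```

With a plain `c->tid` in the condition, the compiler would be free to load it once before the loop, which is the shape of the first listing above.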
Given that in !SMP kernels we know that the loads _must_ happen on the
same CPU, I think we could go a bit further with the loop condition:
while (IS_ENABLED(CONFIG_PREEMPT) &&
IS_ENABLED(CONFIG_SMP) &&
unlikely(tid != READ_ONCE(c->tid)));
The barrier afterwards should be sufficient to order the load of the tid
against subsequent accesses to the other cpu_slab fields.
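As a rough illustration of the reasoning (not kernel code): when the configuration guarantees both loads happen on the same CPU, a compile-time constant in the condition lets the compiler drop the retry altogether. SKETCH_PREEMPT, SKETCH_SMP and tid_needs_retry() are hypothetical stand-ins for the IS_ENABLED() checks, here modelling a PREEMPT && !SMP build:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for IS_ENABLED(CONFIG_PREEMPT) / IS_ENABLED(CONFIG_SMP). */
#define SKETCH_PREEMPT 1
#define SKETCH_SMP     0

/*
 * Returns true when the tid snapshot must be retried. With SKETCH_SMP
 * defined as 0 the && chain short-circuits at compile time, so the
 * volatile load of *cur_tid is never performed and the result is
 * always false.
 */
static bool tid_needs_retry(unsigned long tid,
			    const volatile unsigned long *cur_tid)
{
	return SKETCH_PREEMPT && SKETCH_SMP && tid != *cur_tid;
}
```

On a uniprocessor build a do/while loop using this as its condition therefore runs its body exactly once.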
> What we need to check is redoing whole things in the loop.
> Previous attached code seems to me that it already did
> refetching c->tid in the loop and this patch looks only handle
> refetching c->tid.
The refetch in the loop is this_cpu_read(s->cpu_slab->tid), not the load
of c->tid (which is hoisted above the loop).
> READ_ONCE(c->tid) will trigger redoing 'tid = this_cpu_read(s->cpu_slab->tid)'?
I was under the impression that this_cpu operations would always result
in an access, much like the *_ONCE accessors, so we should always redo
the access for this_cpu_read(s->cpu_slab->tid). Is that not the case?
Mark.
[1] The arm64 this_cpu_* operations are currently buggy. We generate the
percpu address into a register, then perform the access with
separate instructions (and could be preempted between the two).
Steve Capper is currently fixing this.
However, the hoisting of the c->tid load could happen regardless,
whenever raw_cpu_ptr(c) can be evaluated at compile time.