linux-kernel - Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations __free

Message-ID: <2eb3cf340716c40f03a0a342ab40219b3d1de195.camel@gmx.de>
Date:   Tue, 10 Aug 2021 13:47:42 +0200
From:   Mike Galbraith <efault@....de>
To:     Vlastimil Babka <vbabka@...e.cz>,
        Qian Cai <quic_qiancai@...cinc.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Christoph Lameter <cl@...ux.com>,
        David Rientjes <rientjes@...gle.com>,
        Pekka Enberg <penberg@...nel.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH v4 29/35] mm: slub: Move flush_cpu_slab() invocations
 __free_slab() invocations out of IRQ context

On Tue, 2021-08-10 at 11:03 +0200, Vlastimil Babka wrote:
> On 8/9/21 3:41 PM, Qian Cai wrote:
> > >  
> > > +static DEFINE_MUTEX(flush_lock);
> > > +static DEFINE_PER_CPU(struct slub_flush_work, slub_flush);
> > > +
> > >  static void flush_all(struct kmem_cache *s)
> > >  {
> > > -       on_each_cpu_cond(has_cpu_slab, flush_cpu_slab, s, 1);
> > > +       struct slub_flush_work *sfw;
> > > +       unsigned int cpu;
> > > +
> > > +       mutex_lock(&flush_lock);
> >
> > Vlastimil, taking the lock here could trigger a warning during memory offline/online due to the locking order:
> >
> > slab_mutex -> flush_lock
> >
> > [   91.374541] WARNING: possible circular locking dependency detected
> > [   91.381411] 5.14.0-rc5-next-20210809+ #84 Not tainted
> > [   91.387149] ------------------------------------------------------
> > [   91.394016] lsbug/1523 is trying to acquire lock:
> > [   91.399406] ffff800018e76530 (flush_lock){+.+.}-{3:3}, at: flush_all+0x50/0x1c8
> > [   91.407425]
> >                but task is already holding lock:
> > [   91.414638] ffff800018e48468 (slab_mutex){+.+.}-{3:3}, at: slab_memory_callback+0x44/0x280
> > [   91.423603]
> >                which lock already depends on the new lock.
> >
>
> OK, managed to reproduce in qemu and this fixes it for me on top of
> next-20210809. Could you test as well, as your testing might be more
> comprehensive? I will format it as a fixup for the proper patch in the series then.

As expected, taking cpu_hotplug_lock outside slab_mutex in
kmem_cache_destroy() on top of that silenced the cpu offline gripe.

---
 mm/slab_common.c |    2 ++
 mm/slub.c        |    2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -502,6 +502,7 @@ void kmem_cache_destroy(struct kmem_cach
 	if (unlikely(!s))
 		return;

+	cpus_read_lock();
 	mutex_lock(&slab_mutex);

 	s->refcount--;
@@ -516,6 +517,7 @@ void kmem_cache_destroy(struct kmem_cach
 	}
 out_unlock:
 	mutex_unlock(&slab_mutex);
+	cpus_read_unlock();
 }
 EXPORT_SYMBOL(kmem_cache_destroy);

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4234,7 +4234,7 @@ int __kmem_cache_shutdown(struct kmem_ca
 	int node;
 	struct kmem_cache_node *n;

-	flush_all(s);
+	flush_all_cpus_locked(s);
 	/* Attempt to free all objects */
 	for_each_kmem_cache_node(s, node, n) {
 		free_partial(s, n);