Message-ID: <CADvbK_eKRNpE7PkdFLQgmfEme5LgEVDK7WakUc-Rj4XTRSpdiQ@mail.gmail.com>
Date:   Tue, 18 Jan 2022 16:00:51 +0800
From:   Xin Long <lucien.xin@...il.com>
To:     Juri Lelli <juri.lelli@...hat.com>
Cc:     Vlastimil Babka <vbabka@...e.cz>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Hyeonggon Yoo <42.hyeyoo@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
        Christoph Lameter <cl@...ux.com>,
        Pekka Enberg <penberg@...nel.org>,
        David Rientjes <rientjes@...gle.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Antoine Tenart <atenart@...nel.org>,
        Clark Williams <williams@...hat.com>
Subject: Re: [PATCH] mm: slub: fix a deadlock warning in kmem_cache_destroy

On Mon, Jan 17, 2022 at 9:13 PM Juri Lelli <juri.lelli@...hat.com> wrote:
>
> Hi,
>
> On 17/01/22 13:40, Vlastimil Babka wrote:
> > +CC Clark
> >
> > On 1/17/22 10:33, Sebastian Andrzej Siewior wrote:
> > > On 2022-01-17 16:32:46 [+0800], Xin Long wrote:
> > >> another issue. From the code analysis, this issue does exist on the
> > >> upstream kernel, though I couldn't build an upstream RT kernel for the
> > >> testing.
> > >
> > > This should also reproduce in v5.16 since the commit in question is
> > > there.
> >
> > Yeah. I remember we had some issues with the commit during development, but
> > I'd hope those were resolved and the commit that's ultimately merged got the
> > fixes, see this subthread:
> >
> > https://lore.kernel.org/all/0b36128c-3e12-77df-85fe-a153a714569b@quicinc.com/
> >
> > >> > >         CPU0                        CPU1
> > >> > >         ----                        ----
> > >> > >   cpus_read_lock()
> > >> > >                                    kn->active++
> > >> > >                                    cpus_read_lock() [a]
> > >> > >   wait until kn->active == 0
> > >> > >
> > >> > > Although cpu_hotplug_lock is a RWSEM, [a] will not block in there. But as
> > >> > > lockdep annotations are added for cpu_hotplug_lock, a deadlock warning
> > >> > > would be detected:
> > >
> > > The cpu_hotplug_lock is a per-CPU RWSEM. The lock in [a] will block if
> > > there is a writer pending.
> > >
> > >> > >   ======================================================
> > >> > >   WARNING: possible circular locking dependency detected
> > >> > >   ------------------------------------------------------
> > >> > >   dmsetup/1832 is trying to acquire lock:
> > >> > >   ffff986f5a0f9f20 (kn->count#144){++++}-{0:0}, at: kernfs_remove+0x1d/0x30
> > >> > >
> > >> > >   but task is already holding lock:
> > >> > >   ffffffffa43817c0 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_destroy+0x2a/0x120
> > >> > >
> > >
> > > I tried to create & destroy a cryptarget which creates/destroy a cache
> > > via bio_put_slab(). Either the callchain is different or something else
> > > is but I didn't see a lockdep warning.
> >
> > RHEL-8 kernel seems to be 4.18, unless RT uses a newer one. Could be some
> > silently relevant backport is missing? How about e.g. 59450bbc12be ("mm,
> > slab, slub: stop taking cpu hotplug lock") ?
>
> Hummm, looks like we have backported commit 59450bbc12be in RHEL-8.
>
> Xin Long, would you be able to check if you still see the lockdep splat
> with latest upstream RT?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git linux-5.16.y-rt
Hi Juri,

Thanks for sharing the RT kernel repo.

I just tried this kernel and couldn't reproduce the warning in my environment.
However, I don't see how the upstream RT kernel can avoid the call trace.

Since the warning was triggered while the system was shutting down, it may
simply not reproduce on this kernel because of a timing change.
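
To make the timing point concrete, here is a rough sketch of a toy lock-order tracker, assuming a much simplified model of what lockdep does. The names LockOrderGraph, record and has_cycle are hypothetical and are not kernel APIs; the two lock names are taken from the CPU0/CPU1 diagram quoted above. A cycle is only reported once both code paths have actually run, which is one way a change in shutdown timing could hide the warning.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker; hypothetical, not the kernel's lockdep."""

    def __init__(self):
        # edges[a] holds every lock that was acquired while `a` was held.
        self.edges = defaultdict(set)

    def record(self, held, acquiring):
        """Record that `acquiring` was taken while `held` was held."""
        self.edges[held].add(acquiring)

    def has_cycle(self):
        """Return True if the recorded orderings contain a cycle."""
        visiting, done = set(), set()

        def visit(node):
            if node in done:
                return False
            if node in visiting:
                return True
            visiting.add(node)
            found = any(visit(nxt) for nxt in self.edges.get(node, ()))
            visiting.discard(node)
            done.add(node)
            return found

        return any(visit(node) for node in list(self.edges))


graph = LockOrderGraph()

# CPU0 path from the diagram: holds cpu_hotplug_lock, then waits on kn->active.
graph.record("cpu_hotplug_lock", "kn->active")
only_one_path = graph.has_cycle()

# CPU1 path: holds a kn->active reference, then takes cpu_hotplug_lock.
graph.record("kn->active", "cpu_hotplug_lock")
both_paths = graph.has_cycle()

print(only_one_path, both_paths)
```

With only the first path recorded, the tracker sees no cycle; the report appears only after the second ordering has been observed as well.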

>
> Thanks!
> Juri
>
