Message-ID: <20190710225720.58246f8e@oasis.local.home>
Date: Wed, 10 Jul 2019 22:57:20 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Tejun Heo <tj@...nel.org>
Cc: Chris Wilson <chris@...is-wilson.co.uk>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Tvrtko Ursulin <tvrtko.ursulin@...el.com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
intel-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [BUG] lockdep splat with kernfs lockdep annotations and slab
mutex from drm patch??
On Fri, 14 Jun 2019 08:38:37 -0700
Tejun Heo <tj@...nel.org> wrote:
> Hello,
>
> On Fri, Jun 14, 2019 at 04:08:33PM +0100, Chris Wilson wrote:
> > #ifdef CONFIG_MEMCG
> > if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
> > struct kmem_cache *c;
> >
> > mutex_lock(&slab_mutex);
> >
> > so it happens to hit the error + FULL case with the additional slab caches?
> >
> > Anyway, according to lockdep, it is dangerous to use the slab_mutex inside
> > slab_attr_store().
>
> Didn't really look into the code, but it looks like slab_mutex is held
> while trying to remove sysfs files. sysfs file removal flushes
> ongoing accesses, so if a file operation then tries to grab a mutex
> that is held during removal, it leads to a deadlock.
>
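For context, the CONFIG_MEMCG branch quoted above continues roughly like
this in 5.2's mm/slub.c (abridged and paraphrased from memory, not a
verbatim copy):

	#ifdef CONFIG_MEMCG
		if (slab_state >= FULL && err >= 0 && is_root_cache(s)) {
			struct kmem_cache *c;

			mutex_lock(&slab_mutex);
			if (s->max_attr_size < len)
				s->max_attr_size = len;

			/* Best-effort propagation of the written value
			 * to the memcg child caches, done while holding
			 * slab_mutex. This is the acquisition lockdep
			 * complains about below. */
			for_each_memcg_cache(c, s)
				attribute->store(c, buf, len);
			mutex_unlock(&slab_mutex);
		}
	#endif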
Looks like this never got fixed, and now the bug is in 5.2. Just got this:
======================================================
WARNING: possible circular locking dependency detected
5.2.0-test #15 Not tainted
------------------------------------------------------
slub_cpu_partia/899 is trying to acquire lock:
000000000f6f2dd7 (slab_mutex){+.+.}, at: slab_attr_store+0x6d/0xe0
but task is already holding lock:
00000000b23ffe3d (kn->count#160){++++}, at: kernfs_fop_write+0x125/0x230
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (kn->count#160){++++}:
       __kernfs_remove+0x413/0x4a0
       kernfs_remove_by_name_ns+0x40/0x80
       sysfs_slab_add+0x1b5/0x2f0
       __kmem_cache_create+0x511/0x560
       create_cache+0xcd/0x1f0
       kmem_cache_create_usercopy+0x18a/0x240
       kmem_cache_create+0x12/0x20
       is_active_nid+0xdb/0x230 [snd_hda_codec_generic]
       snd_hda_get_path_idx+0x55/0x80 [snd_hda_codec_generic]
       get_nid_path+0xc/0x170 [snd_hda_codec_generic]
       do_one_initcall+0xa2/0x394
       do_init_module+0xfd/0x370
       load_module+0x38c6/0x3bd0
       __do_sys_finit_module+0x11a/0x1b0
       do_syscall_64+0x68/0x250
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (slab_mutex){+.+.}:
       lock_acquire+0xbd/0x1d0
       __mutex_lock+0xfc/0xb70
       slab_attr_store+0x6d/0xe0
       kernfs_fop_write+0x170/0x230
       vfs_write+0xe1/0x240
       ksys_write+0xba/0x150
       do_syscall_64+0x68/0x250
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(kn->count#160);
                               lock(slab_mutex);
                               lock(kn->count#160);
  lock(slab_mutex);

 *** DEADLOCK ***
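In other words, the inversion boils down to these two paths (an
illustrative sketch, not actual kernel code; "parent" and "name" are
placeholders):

	/* Chain #1: cache creation error path, slab_mutex held first.
	 * kernfs_remove_by_name_ns() waits for active references
	 * (kn->count) on the sysfs file to drain before returning.
	 */
	mutex_lock(&slab_mutex);
	kernfs_remove_by_name_ns(parent, name, NULL);
	mutex_unlock(&slab_mutex);

	/* Chain #0: a write to the same sysfs file. kernfs_fop_write()
	 * holds an active reference (kn->count) across the ->store()
	 * callback, and slab_attr_store() then takes slab_mutex.
	 */
	mutex_lock(&slab_mutex);
	/* ... propagate the attribute to the memcg caches ... */
	mutex_unlock(&slab_mutex);

If the writer sleeps on slab_mutex while the remover holds it and is
waiting for the writer's kn->count reference to drop, neither side can
make progress.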
Attached is a config and the full dmesg.
-- Steve