Message-ID: <20170616085507.3cc7d4b8@gandalf.local.home>
Date:   Fri, 16 Jun 2017 08:55:07 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Tejun Heo <tj@...nel.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Christoph Lameter <cl@...ux.com>,
        Pekka Enberg <penberg@...nel.org>,
        David Rientjes <rientjes@...gle.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: [LOCKDEP BUG] from slub: separate out sysfs_slab_release() from
 sysfs_slab_remove()

In my tests, I hit the following lockdep splat:

 
 ======================================================
 [ INFO: possible circular locking dependency detected ]
 4.10.0-test+ #48 Not tainted
 -------------------------------------------------------
 rmmod/1211 is trying to acquire lock:
  (s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40
 
 but task is already holding lock:
  (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
 
 which lock already depends on the new lock.
 
 
 the existing dependency chain (in reverse order) is:
 
 -> #1 (slab_mutex){+.+.+.}:
        lock_acquire+0xf6/0x1f0
        __mutex_lock+0x75/0x950
        mutex_lock_nested+0x1b/0x20
        slab_attr_store+0x75/0xd0
        sysfs_kf_write+0x45/0x60
        kernfs_fop_write+0x13c/0x1c0
        __vfs_write+0x28/0x120
        vfs_write+0xc8/0x1e0
        SyS_write+0x49/0xa0
        entry_SYSCALL_64_fastpath+0x1f/0xc2
 
 -> #0 (s_active#120){++++.+}:
        __lock_acquire+0x10ed/0x1260
        lock_acquire+0xf6/0x1f0
        __kernfs_remove+0x254/0x320
        kernfs_remove+0x23/0x40
        sysfs_remove_dir+0x51/0x80
        kobject_del+0x18/0x50
        __kmem_cache_shutdown+0x3e6/0x460
        kmem_cache_destroy+0x1fb/0x2d0
        kvm_exit+0x2d/0x80 [kvm]
        vmx_exit+0x19/0xa1b [kvm_intel]
        SyS_delete_module+0x198/0x1f0
        entry_SYSCALL_64_fastpath+0x1f/0xc2
 
 other info that might help us debug this:
 
  Possible unsafe locking scenario:
 
        CPU0                    CPU1
        ----                    ----
   lock(slab_mutex);
                                lock(s_active#120);
                                lock(slab_mutex);
   lock(s_active#120);
 
  *** DEADLOCK ***
 
 2 locks held by rmmod/1211:
  #0:  (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80
  #1:  (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0
 
 stack backtrace:
 CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
 Call Trace:
  dump_stack+0x86/0xc3
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x10ed/0x1260
  ? 0xffffffffa0000077
  lock_acquire+0xf6/0x1f0
  ? kernfs_remove+0x23/0x40
  __kernfs_remove+0x254/0x320
  ? kernfs_remove+0x23/0x40
  ? __kernfs_remove+0x5/0x320
  kernfs_remove+0x23/0x40
  sysfs_remove_dir+0x51/0x80
  kobject_del+0x18/0x50
  __kmem_cache_shutdown+0x3e6/0x460
  kmem_cache_destroy+0x1fb/0x2d0
  kvm_exit+0x2d/0x80 [kvm]
  vmx_exit+0x19/0xa1b [kvm_intel]
  SyS_delete_module+0x198/0x1f0
  ? SyS_delete_module+0x5/0x1f0
  entry_SYSCALL_64_fastpath+0x1f/0xc2
 RIP: 0033:0x7fdf0439f487
 RSP: 002b:00007ffcd39eb6b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007fdf0439f487
 RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055e4e32f3258
 RBP: 0000000000000000 R08: 000000000000000a R09: 0000000000000000
 R10: 00007fdf0440ace0 R11: 0000000000000206 R12: 000055e4e32f31f0
 R13: 00007ffcd39ea6a0 R14: 0000000000000000 R15: 000055e4e32f31f0
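
For readers unfamiliar with this kind of report, the two dependency chains above can be modeled as a small lock-order graph. What follows is only an illustrative sketch, not the kernel's lockdep implementation: the class LockOrderGraph and its acquire() method are hypothetical names made up for this example, and the lock names are taken from the splat. An edge A -> B means "B was acquired while A was held", and a new edge is flagged when the reverse path already exists.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical; NOT the kernel's lockdep)."""

    def __init__(self):
        # held lock -> set of locks that were acquired while holding it
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        """Depth-first search: is there a recorded path src -> ... -> dst?"""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record that `new` is taken while `held` is held.

        Returns True if `held` is already reachable from `new`, i.e. the
        new edge would close a cycle (a possible AB-BA deadlock).
        """
        circular = self._reachable(new, held)
        self.edges[held].add(new)
        return circular


graph = LockOrderGraph()

# Chain #1 in the splat: a sysfs write holds s_active, and
# slab_attr_store() then takes slab_mutex.
first = graph.acquire("s_active", "slab_mutex")

# Chain #0 in the splat: kmem_cache_destroy() holds slab_mutex, and
# kernfs_remove() then takes s_active.
second = graph.acquire("slab_mutex", "s_active")
```

In this model the first acquisition records the order s_active -> slab_mutex without complaint, and the second one is flagged because it asks for the opposite order, which corresponds to the CPU0/CPU1 scenario printed in the splat.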


I bisected it down to commit bf5eb3de3847ebcfd

"slub: separate out sysfs_slab_release() from sysfs_slab_remove()"

To hit this bug, I simply had to log in and perform:

 # rmmod kvm_intel

and it triggered.

It looks as though this commit was added to allow for other changes, so
just reverting it won't work.

I attached the config that triggers this too.

-- Steve

Download attachment "config.gz" of type "application/gzip" (28205 bytes)