Message-ID: <20170908154136.096d49ff@gandalf.local.home>
Date:   Fri, 8 Sep 2017 15:41:36 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Tom Zanussi <tom.zanussi@...ux.intel.com>
Cc:     tglx@...utronix.de, mhiramat@...nel.org, namhyung@...nel.org,
        vedang.patel@...el.com, bigeasy@...utronix.de,
        joel.opensrc@...il.com, joelaf@...gle.com,
        mathieu.desnoyers@...icios.com, baohong.liu@...el.com,
        linux-kernel@...r.kernel.org, linux-rt-users@...r.kernel.org
Subject: Re: [PATCH v2 35/40] tracing: Reverse the order
 event_mutex/trace_types_lock are taken

On Fri, 8 Sep 2017 15:31:35 -0400
Steven Rostedt <rostedt@...dmis.org> wrote:

> On Tue,  5 Sep 2017 16:57:47 -0500
> Tom Zanussi <tom.zanussi@...ux.intel.com> wrote:
> 
> > Change the order in which event_mutex and trace_types_lock are taken,
> > to avoid circular dependencies and lockdep spew.
> > 
> > Changing the order shouldn't matter to any current code, but does to
> > anything that takes the event_mutex first and then trace_types_lock.
> > This is the case when calling tracing_set_clock from inside an event
> > command, which already holds the event_mutex.  
> 
> This is a very scary patch. I'll apply it and run a bunch of tests with
> lockdep enabled. Let's see what blows up (or not).

Boom!

 ======================================================
 WARNING: possible circular locking dependency detected
 4.13.0-rc7-test+ #84 Not tainted
 ------------------------------------------------------
 mkdir/1674 is trying to acquire lock:
  (event_mutex){+.+.+.}, at: [<ffffffff811b18bd>] event_trace_add_tracer+0x1d/0xb0
 
               but task is already holding lock:
  (trace_types_lock){+.+.+.}, at: [<ffffffff811a121f>] instance_mkdir+0x2f/0x250
 
               which lock already depends on the new lock.

 
               the existing dependency chain (in reverse order) is:
 
               -> #1 (trace_types_lock){+.+.+.}:
        lock_acquire+0xe3/0x1d0
        __mutex_lock+0x81/0x950
        mutex_lock_nested+0x1b/0x20
        trace_module_notify+0x33/0x1b0
        notifier_call_chain+0x4a/0x70
        __blocking_notifier_call_chain+0x4d/0x70
        blocking_notifier_call_chain+0x16/0x20
        load_module+0x21df/0x2dd0
        SYSC_finit_module+0xbc/0xf0
        SyS_finit_module+0xe/0x10
        do_syscall_64+0x62/0x140
        return_from_SYSCALL_64+0x0/0x7a
 
               -> #0 (event_mutex){+.+.+.}:
        __lock_acquire+0x1026/0x11d0
        lock_acquire+0xe3/0x1d0
        __mutex_lock+0x81/0x950
        mutex_lock_nested+0x1b/0x20
        event_trace_add_tracer+0x1d/0xb0
        instance_mkdir+0x173/0x250
        tracefs_syscall_mkdir+0x40/0x70
        vfs_mkdir+0xfb/0x190
        SyS_mkdir+0x6b/0xd0
        entry_SYSCALL_64_fastpath+0x1f/0xbe
 
               other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(trace_types_lock);
                                lock(event_mutex);
                                lock(trace_types_lock);
   lock(event_mutex);
 
                *** DEADLOCK ***

 2 locks held by mkdir/1674:
  #0:  (sb_writers#10){.+.+.+}, at: [<ffffffff812b4824>] mnt_want_write+0x24/0x50
  #1:  (trace_types_lock){+.+.+.}, at: [<ffffffff811a121f>] instance_mkdir+0x2f/0x250
 
               stack backtrace:
 CPU: 3 PID: 1674 Comm: mkdir Not tainted 4.13.0-rc7-test+ #84
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 Call Trace:
  dump_stack+0x86/0xcf
  print_circular_bug+0x1be/0x210
  __lock_acquire+0x1026/0x11d0
  lock_acquire+0xe3/0x1d0
  ? lock_acquire+0xe3/0x1d0
  ? event_trace_add_tracer+0x1d/0xb0
  ? event_trace_add_tracer+0x1d/0xb0
  __mutex_lock+0x81/0x950
  ? event_trace_add_tracer+0x1d/0xb0
  ? event_trace_add_tracer+0x1d/0xb0
  ? __create_dir+0xcb/0x130
  mutex_lock_nested+0x1b/0x20
  ? mutex_lock_nested+0x1b/0x20
  event_trace_add_tracer+0x1d/0xb0
  instance_mkdir+0x173/0x250
  tracefs_syscall_mkdir+0x40/0x70
  ? tracefs_syscall_mkdir+0x40/0x70
  vfs_mkdir+0xfb/0x190
  SyS_mkdir+0x6b/0xd0
  entry_SYSCALL_64_fastpath+0x1f/0xbe
 RIP: 0033:0x7f4867afa947
 RSP: 002b:00007ffd3dc35c08 EFLAGS: 00000202 ORIG_RAX: 0000000000000053
 RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f4867afa947
 RDX: 0000000000000000 RSI: 00000000000001ff RDI: 00007ffd3dc3764f
 RBP: ffffc90000f2ff98 R08: 00000000000001ff R09: 000055f96a551ac0
 R10: 000055f96b700010 R11: 0000000000000202 R12: 0000000000000001
 R13: 00007ffd3dc35f28 R14: 00000000000001ff R15: ffffffff8189316a
  ? entry_SYSCALL_64_after_swapgs+0x17/0x4f

It appears to be caused by instance creation. I'll look at that.
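
For anyone following along, here is a minimal sketch of why lockdep complains. It is a toy model in Python, not the kernel's lockdep implementation: the LockOrderGraph class and its acquire() method are hypothetical names made up for illustration. It records "lock B was taken while lock A was held" edges and rejects an acquisition that would close a cycle. It assumes, as the scenario table in the report implies, that the trace_module_notify() path now takes event_mutex before trace_types_lock.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy model of lock-order tracking (hypothetical, not kernel code)."""

    def __init__(self):
        # after[a] holds every lock that was acquired while a was held.
        self.after = defaultdict(set)

    def _reaches(self, src, dst):
        # Depth-first search: is there a recorded ordering path src -> dst?
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after[node])
        return False

    def acquire(self, held, new):
        """Record taking `new` while `held` locks are held.

        Returns False (and records nothing) if this ordering would
        create a circular dependency with orderings seen earlier.
        """
        for h in held:
            if self._reaches(new, h):
                return False
        for h in held:
            self.after[h].add(new)
        return True


graph = LockOrderGraph()

# Chain #1 in the report: module load path, assumed to hold event_mutex
# when it takes trace_types_lock (the order after the patch).
module_ok = graph.acquire(["event_mutex"], "trace_types_lock")

# Chain #0 in the report: instance_mkdir() holds trace_types_lock and
# event_trace_add_tracer() then takes event_mutex (the old order).
mkdir_ok = graph.acquire(["trace_types_lock"], "event_mutex")

print(module_ok, mkdir_ok)
```

The second call fails because the instance creation path still uses the old order, which is the inversion shown in the CPU0/CPU1 table above.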


-- Steve
