Message-ID: <e945621d-2eff-0621-7508-f39ce3412135@candelatech.com>
Date: Wed, 9 Oct 2024 11:10:58 -0700
From: Ben Greear <greearb@...delatech.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: boqun.feng@...il.com
Subject: lockdep: Not able to find what holds lock.

Hello,

I'm debugging a lockup related to wifi, and have some questions on lockdep.

Part of the output looks like this.  From what I can tell, the first kworker
(kworker/0:1) holds rtnl_mutex and is blocked trying to acquire wiphy.mtx.

Showing all locks held in the system:
4 locks held by kworker/0:1/10:
  #0: ffff888110061548 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0xf16/0x1750
  #1: ffff888112bbfd80 (reg_work){+.+.}-{0:0}, at: process_one_work+0x7d6/0x1750
  #2: ffffffff856a2ee8 (rtnl_mutex){+.+.}-{4:4}, at: reg_todo+0x13/0x770 [cfg80211]
  #3: ffff888144150768 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: reg_process_self_managed_hints+0x6b/0x180 [cfg80211]
3 locks held by kworker/u32:0/11:
  #0: ffff888110071948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0xf16/0x1750
  #1: ffff888112bcfd80 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work+0x7d6/0x1750
  #2: ffffffff856a2ee8 (rtnl_mutex){+.+.}-{4:4}, at: linkwatch_event+0x5/0x50
4 locks held by kworker/u32:1/66:
1 lock held by khungtaskd/67:
...
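
One way to check a dump like this for other tasks that list the same lock is to compare the lock addresses printed on each held-lock line. The following is a minimal user-space sketch, not part of lockdep; the helper names `parse_held_lock` and `count_lock_entries` are hypothetical, and it assumes each held-lock line starts with ` #<index>: <hex address>` as in the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse one held-lock line of the form " #<idx>: <hex address> (...".
 * Returns 1 and fills idx/addr on success, 0 for any other line
 * (for example the "N locks held by ..." header lines).
 */
static int parse_held_lock(const char *line, int *idx, unsigned long long *addr)
{
	assert(line != NULL);
	return sscanf(line, " #%d: %llx", idx, addr) == 2;
}

/* Count how many lines of the dump list the lock at address 'target'. */
static int count_lock_entries(const char *const *lines, int nlines,
			      unsigned long long target)
{
	int i, idx, hits = 0;
	unsigned long long addr;

	for (i = 0; i < nlines; i++) {
		if (parse_held_lock(lines[i], &idx, &addr) && addr == target)
			hits++;
	}
	return hits;
}
```

Feeding it the lines above, the rtnl_mutex address ffffffff856a2ee8 is listed by two tasks, while the wiphy.mtx address ffff888144150768 is listed only by the kworker that is blocked on it.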

Nothing else in the printout shows wiphy.mtx held, but some tasks are skipped
(for example kworker/u32:1/66 above, which reports 4 locks but lists none), evidently
because they are running but are not the current task.  Is there any real
harm in adding a patch like this to maybe show what actually holds the lock in question?

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 3468d8230e5f..fc7c1c7e0d8f 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -781,6 +781,7 @@ static void print_lock(struct held_lock *hlock)
  static void lockdep_print_held_locks(struct task_struct *p)
  {
         int i, depth = READ_ONCE(p->lockdep_depth);
+       const char* rnc = "";

         if (!depth)
                 printk("no locks held by %s/%d.\n", p->comm, task_pid_nr(p));
@@ -792,9 +793,10 @@ static void lockdep_print_held_locks(struct task_struct *p)
          * and it's not the current task.
          */
         if (p != current && task_is_running(p))
-               return;
+               rnc = " Not reliable: Running-Not-Current: ";
+
         for (i = 0; i < depth; i++) {
-               printk(" #%d: ", i);
+               printk(" %s#%d: ", rnc, i);
                 print_lock(p->held_locks + i);
         }
  }


Also, when lockdep prints a task's 'held' locks, it seems to include the lock that the
task wants to hold but is blocked acquiring.  Is there a way to have lockdep print
the task information (preferably including a backtrace) for the task that actually
owns the lock currently?
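
To illustrate the distinction being asked about, here is a toy user-space model, not kernel code. It assumes (as the output above suggests) that a lock is recorded in the task's held list before the task knows whether it will block, and that ownership is tracked separately on the lock itself. All `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_HELD 8

struct toy_task;

/* Toy lock: 'owner' is set only once the lock is actually obtained. */
struct toy_lock {
	const char *name;
	struct toy_task *owner;
};

/* Toy task: 'held' loosely mirrors a per-task held-locks array. */
struct toy_task {
	const char *comm;
	int depth;
	struct toy_lock *held[MAX_HELD];
};

/*
 * Record the lock in the task's held list first, then try to take it.
 * Returns 1 if the lock was obtained, 0 if the task would block
 * (or if the held list is full, in which case nothing is recorded).
 */
static int toy_lock_acquire(struct toy_task *t, struct toy_lock *l)
{
	assert(t != NULL && l != NULL);
	if (t->depth >= MAX_HELD)
		return 0;
	t->held[t->depth++] = l;	/* recorded before the outcome is known */
	if (l->owner != NULL)
		return 0;		/* contended: a real task would sleep here */
	l->owner = t;
	return 1;
}

/* Count the listed locks that the task actually owns. */
static int toy_owned_count(const struct toy_task *t)
{
	int i, n = 0;

	for (i = 0; i < t->depth; i++)
		if (t->held[i]->owner == t)
			n++;
	return n;
}

/* Print each listed lock, marking entries the task is only waiting for. */
static void toy_print_held(const struct toy_task *t)
{
	int i;

	printf("%d lock%s listed for %s:\n", t->depth,
	       t->depth == 1 ? "" : "s", t->comm);
	for (i = 0; i < t->depth; i++) {
		const struct toy_lock *l = t->held[i];

		if (l->owner == t)
			printf(" #%d: %s (owned)\n", i, l->name);
		else
			printf(" #%d: %s (waiting, owner: %s)\n", i, l->name,
			       l->owner ? l->owner->comm : "none");
	}
}
```

In this model, if one task already owns wiphy.mtx and a second task takes rtnl_mutex and then tries wiphy.mtx, the second task lists two locks but owns only one, and the owner field on the lock is what identifies the real holder.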

Thanks,
Ben

-- 
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc  http://www.candelatech.com

