Message-ID: <CAB8ipk-E63myUvW7X=kX7t0_zW9wyBOcseyyh_tjOJNpzz=fZQ@mail.gmail.com>
Date: Tue, 9 Jan 2024 10:55:11 +0800
From: Xuewen Yan <xuewen.yan94@...il.com>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Xuewen Yan <xuewen.yan@...soc.com>, peterz@...radead.org, mingo@...hat.com, 
	will@...nel.org, longman@...hat.com, ke.wang@...soc.com, 
	zhiguo.niu@...soc.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] lock/lockdep: Add missing graph_unlock in validate_chain

Hi Boqun,

On Tue, Jan 9, 2024 at 12:30 AM Boqun Feng <boqun.feng@...il.com> wrote:
>
> On Fri, Jan 05, 2024 at 12:46:36PM +0800, Xuewen Yan wrote:
> [...]
> > >
> > > Are you hitting a real issue or this is found by code reading?
> >
> > Indeed, we hit a real issue:
> > one CPU did not call graph_unlock and, as a result, deadlocked with
> > the other CPUs, because any CPU calling raw_spin_lock would have to
> > take the graph_lock first.
> >
>
> Could you share more details about the real issue you hit? For example,
> a serial log? I ask because, although the graph_unlock() makes logical
> sense, that path should really not be hit if lockdep works correctly.

The following is our stack:

First, there is a scenario in our kernel tree:
    get mutexA lock --> get rcu read lock --> get mutexB lock
As a result, there is a warning about the chain:

[    7.344848][ T1@C0] reboot: Restarting system with command 'userrequested,recovery'
[    7.354028][ T1@C0]
[    7.354358][ T1@C0] =============================
[    7.354967][ T1@C0] [ BUG: Invalid wait context ]
[    7.355580][ T1@C0] 6.6.5-0-g53d4f6ea5e56-dirty-ab000013 #1 Tainted: G        W
[    7.356720][ T1@C0] -----------------------------
[    7.357326][ T1@C0] init/1 is trying to lock:
[    7.357894][ T1@C0] ffffffc07b0b5310 (kmsg_buf_lock){+.+.}-{3:3}, at: last_kmsg_handler+0x60/0xb8
[    7.359244][ T1@C0] other info that might help us debug this:
[    7.359982][ T1@C0] context-{4:4}
[    7.360420][ T1@C0] 2 locks held by init/1:
[    7.360967][ T1@C0]  #0: ffffffc08234d0b8 (system_transition_mutex){+.+.}-{3:3}, at: __arm64_sys_reboot+0x130/0x27c
[    7.362353][ T1@C0]  #1: ffffffc0823c9a18 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x0/0x38
[    7.363568][ T1@C0] stack backtrace:
[    7.364038][ T1@C0] CPU: 0 PID: 1 Comm: init Tainted: G        W       6.6.5-android14-0-g53d4f6ea5e56-dirty-ab000013 #1
[    7.365453][ T1@C0] Hardware name: Unisoc ***-base Board (DT)
[    7.366238][ T1@C0] Call trace:
[    7.366652][ T1@C0]  dump_backtrace+0xf8/0x148
[    7.367244][ T1@C0]  show_stack+0x20/0x30
[    7.367779][ T1@C0]  dump_stack_lvl+0x60/0x84
[    7.368372][ T1@C0]  __lock_acquire+0xc2c/0x3288
[    7.368982][ T1@C0]  lock_acquire+0x124/0x2b0
[    7.369556][ T1@C0]  __mutex_lock+0xa0/0xbfc
[    7.370122][ T1@C0]  mutex_lock_nested+0x2c/0x38
[    7.371494][ T1@C0]  last_kmsg_handler+0x60/0xb8   <<<< gets mutex B
[    7.372028][ T1@C0]  kmsg_dump+0xf0/0x16c          <<<< takes rcu read lock
[    7.372028][ T1@C0]  kernel_restart+0x100/0x11c
[    7.372626][ T1@C0]  __arm64_sys_reboot+0x1a8/0x27c
[    7.373270][ T1@C0]  invoke_syscall+0x60/0x11c
[    7.373857][ T1@C0]  el0_svc_common+0xb4/0xf0
[    7.374434][ T1@C0]  do_el0_svc+0x24/0x30
[    7.374967][ T1@C0]  el0_svc+0x50/0xe4
[    7.375467][ T1@C0]  el0t_64_sync_handler+0x68/0xbc
[    7.376109][ T1@C0]  el0t_64_sync+0x1a8/0x1ac

This warning appears to be caused by the order in which the locks are taken.
But when CONFIG_PREEMPT_RCU is enabled, I think it is reasonable to
take a mutex after taking the RCU read lock.
Maybe the logic of check_wait_context() needs to be modified?
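
To make the scenario concrete, here is a minimal sketch of the pattern
that triggers the report (hypothetical function and generic lock names,
assuming lockdep's wait-type checks are enabled):

    #include <linux/mutex.h>
    #include <linux/rcupdate.h>

    /* Hypothetical stand-ins for the locks in the report above. */
    static DEFINE_MUTEX(mutex_a);      /* e.g. system_transition_mutex */
    static DEFINE_MUTEX(mutex_b);      /* e.g. kmsg_buf_lock */

    static void trigger_invalid_wait_context(void)
    {
            mutex_lock(&mutex_a);      /* outer mutex is fine on its own */
            rcu_read_lock();           /* enter RCU read-side critical section */
            mutex_lock(&mutex_b);      /* check_wait_context() reports here:
                                        * a sleeping lock is acquired inside
                                        * an RCU read-side critical section */
            mutex_unlock(&mutex_b);
            rcu_read_unlock();
            mutex_unlock(&mutex_a);
    }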

And then the following stack occurs:

core0 takes rdp_gp->nocb_gp_lock and then tries to take the graph_lock:

-000|queued_spin_lock_slowpath(lock = 0xFFFFFFC08291BBC8, val = 0)
-001|queued_spin_lock()
-001|lockdep_lock()
-001|graph_lock()
-002|lookup_chain_cache_add()
-002|validate_chain()
-003|lock_acquire(lock = 0xFFFFFF817F211D98, subclass = 0, trylock = 0, read = 0, check = 1, nest_lock = 0x0, ip = 18446743800981533176)
-004|__raw_spin_lock_irqsave()
    |  lock = 0xFFFFFF817F211D80 -> (
    |    raw_lock = (
    |      val = (counter = 0),
    |      locked = 0,
    |      pending = 0,
    |      locked_pending = 0,
    |      tail = 0),
    |    magic = 3735899821,
    |    owner_cpu = 4294967295,
    |    owner = 0xFFFFFFFFFFFFFFFF,
    |    dep_map = (key = 0xFFFFFFC082F12C88, class_cache = (0xFFFFFFC0825378E0, 0x0), name = 0xFFFFFFC081541039, wait_type_outer = 0, wait_type_inner = 2, lock_type = 0))
    |  flags = 0
    |
-004|_raw_spin_lock_irqsave(lock = 0xFFFFFF817F211D80)
-005|lock_timer_base(inline)
-005|__mod_timer.llvm.7968396489078322347(timer = 0xFFFFFF817F2A88D0, expires = 4294893461, options = 0)
-006|mod_timer(inline)
-006|wake_nocb_gp_defer(inline)
-006|__call_rcu_nocb_wake(rdp = 0xFFFFFF817F2A8680, was_alldone = ?, flags = ?)
-007|__call_rcu_common(inline)
-007|call_rcu(head = 0xFFFFFFC0822E0B58, func = ?)
-008|call_rcu_hurry(inline)
-008|rcu_sync_call(inline)
-008|rcu_sync_func(rhp = 0xFFFFFFC0822E0B58)
-009|rcu_do_batch(rdp = 0xFFFFFF817F266680)
-010|nocb_cb_wait(inline)
-010|rcu_nocb_cb_kthread(arg = 0xFFFFFF817F266680)
-011|kthread(_create = 0xFFFFFF8080363740)
-012|ret_from_fork(asm)

However, the graph_lock is owned by core1, which in turn is trying to
take rdp_gp->nocb_gp_lock, and that causes the deadlock.
But we do not see where that core failed to unlock the graph_lock:

-000|queued_spin_lock_slowpath(lock = 0xFFFFFF817F2A8A80, val = ?)
-001|queued_spin_lock(inline)
-001|do_raw_spin_lock(lock = 0xFFFFFF817F2A8A80)
-002|__raw_spin_lock_irqsave(inline)
-002|_raw_spin_lock_irqsave(lock = 0xFFFFFF817F2A8A80)
-003|wake_nocb_gp_defer(inline)
-003|__call_rcu_nocb_wake(rdp = 0xFFFFFF817F30B680, was_alldone = ?, flags = ?)
-004|__call_rcu_common(inline)
-004|call_rcu(head = 0xFFFFFFC082EECC28, func = ?)
-005|call_rcu_zapped(inline)
-005|free_zapped_rcu(ch = ?)
-006|rcu_do_batch(rdp = 0xFFFFFF817F245680)
-007|nocb_cb_wait(inline)
-007|rcu_nocb_cb_kthread(arg = 0xFFFFFF817F245680)
-008|kthread(_create = 0xFFFFFF80803122C0)
-009|ret_from_fork(asm)
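
So the two stacks form a classic AB-BA deadlock. A hedged sketch of the
interleaving we believe happened, condensed from the two backtraces
above (graph_lock()/lockdep_lock() and nocb_gp_lock are the real kernel
symbols; the surrounding control flow is simplified):

    /* core0, in an rcu_nocb_cb_kthread: holds A, wants B */
    raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags); /* A: wake_nocb_gp_defer() */
    mod_timer(&rdp_gp->nocb_timer, expires);             /* -> lock_timer_base()
                                                          * -> lock_acquire()
                                                          * -> graph_lock(): spins on B */

    /* core1, in lockdep's free_zapped_rcu callback: holds B, wants A */
    graph_lock();                                        /* B: lockdep_lock() */
    call_rcu(&delayed_free.rcu_head, free_zapped_rcu);   /* -> __call_rcu_nocb_wake()
                                                          * -> wake_nocb_gp_defer():
                                                          * spins on A (nocb_gp_lock) */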

Based on this, we read the lockdep code and found the logic in
check_prev_add() that can return without releasing the graph_lock.
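
For reference, the pattern the patch targets looks roughly like this
(a simplified sketch of the suspected path, not the exact upstream code):

    static int validate_chain(struct task_struct *curr,
                              struct held_lock *hlock, ...)
    {
            if (lookup_chain_cache_add(...)) {
                    /* lookup_chain_cache_add() returns with graph_lock held */
                    if (!check_prevs_add(curr, hlock))
                            return 0;       /* suspected: returns with graph_lock
                                             * still held, so every later
                                             * lock_acquire() spins forever */
                    graph_unlock();
            }
            return 1;
    }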

But now we have reproduced the scenario even though we added the
graph_unlock() for check_prev_add().
So this is not the root cause of our problem, and we are still debugging.
We would be very grateful if you could give us some suggestions.

Thanks!

---
BRs
xuewen


> Thanks!
>
> Regards,
> Boqun
>
> > Thanks!
