linux-kernel - WARNING: ODEBUG bug in process_one

Message-ID: <94eb2c0bc3aa8b2432056623c735@google.com>
Date:   Mon, 26 Feb 2018 12:59:01 -0800
From:   syzbot <syzbot+3b4acab09b6463472d0a@...kaller.appspotmail.com>
To:     danielj@...lanox.com, dledford@...hat.com, jgg@...pe.ca,
        johannes.berg@...el.com, leonro@...lanox.com,
        linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
        monis@...lanox.com, pabeni@...hat.com, parav@...lanox.com,
        roland@...estorage.com, syzkaller-bugs@...glegroups.com,
        yuval.shaia@...cle.com
Subject: WARNING: ODEBUG bug in process_one_req

Hello,

syzbot hit the following crash on upstream commit
af3e79d29555b97dd096e2f8e36a0f50213808a8 (Tue Feb 20 18:05:02 2018 +0000)
Merge tag 'leds_for-4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds

So far this crash happened 338 times on bpf-next, upstream.
C reproducer is attached.
syzkaller reproducer is attached.
Raw console output is attached.
compiler: gcc (GCC) 7.1.1 20170620
.config is attached.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3b4acab09b6463472d0a@...kaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.
If you forward the report, please keep this part and the footer.

------------[ cut here ]------------
ODEBUG: free active (active state 0) object type: work_struct hint: process_one_req+0x0/0x6c0 include/net/dst.h:165
WARNING: CPU: 0 PID: 21 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 lib/debugobjects.c:288
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 4.16.0-rc2+ #324
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: ib_addr process_one_req
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x24d lib/dump_stack.c:53
  panic+0x1e4/0x41c kernel/panic.c:183
  __warn+0x1dc/0x200 kernel/panic.c:547
  report_bug+0x211/0x2d0 lib/bug.c:184
  fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
  fixup_bug arch/x86/kernel/traps.c:247 [inline]
  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
  invalid_op+0x58/0x80 arch/x86/entry/entry_64.S:957
RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288
RSP: 0018:ffff8801d9447250 EFLAGS: 00010086
RAX: dffffc0000000008 RBX: 0000000000000003 RCX: ffffffff815abdbe
RDX: 0000000000000000 RSI: 1ffff1003b288dfa RDI: 1ffff1003b288dcf
RBP: ffff8801d9447290 R08: 0000000000000000 R09: 1ffff1003b288da1
R10: ffffed003b288e79 R11: ffffffff86f394b8 R12: 0000000000000001
R13: ffffffff86f14d80 R14: ffffffff86407de0 R15: ffffffff8147ac00
  __debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
  debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
  kfree+0xc7/0x260 mm/slab.c:3799
  process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:597
  process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
  kthread+0x33c/0x400 kernel/kthread.c:238
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:407
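
The warning above comes from the debugobjects check that runs inside kfree(): the memory being freed still contains a work_struct that the tracker considers active. As a rough illustration only, the following Python sketch models that bookkeeping in user space. The class and method names (DebugObjects, activate, deactivate, check_no_obj_freed) and the addresses are hypothetical and merely echo the function names in the trace; this is not the kernel's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DebugObjects:
    """Toy tracker: maps an object's address to its type while it is active."""
    active: dict = field(default_factory=dict)
    warnings: list = field(default_factory=list)

    def activate(self, addr, obj_type):
        # Called when, for example, a work item is queued.
        self.active[addr] = obj_type

    def deactivate(self, addr):
        # Called when the work item is cancelled or has finished.
        self.active.pop(addr, None)

    def check_no_obj_freed(self, start, size):
        # Called on free: any tracked object inside [start, start + size)
        # that is still active is reported.
        for addr, obj_type in sorted(self.active.items()):
            if start <= addr < start + size:
                self.warnings.append(
                    f"ODEBUG: free active object type: {obj_type}"
                )
                # Stop tracking memory that is going away.
                del self.active[addr]
        return self.warnings


tracker = DebugObjects()
req_addr, req_size = 0x1000, 256   # hypothetical request allocation
work_addr = req_addr + 64          # work item embedded in the request
tracker.activate(work_addr, "work_struct")
# Freeing the request without deactivating the work item first is flagged.
report = tracker.check_no_obj_freed(req_addr, req_size)
```

In this model the report stays empty if deactivate(work_addr) is called before the free, which is the ordering the real check is there to enforce.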

======================================================
WARNING: possible circular locking dependency detected
4.16.0-rc2+ #324 Not tainted
------------------------------------------------------
kworker/u4:1/21 is trying to acquire lock:
  ((console_sem).lock){..-.}, at: [<00000000a1a1c3f2>] down_trylock+0x13/0x70 kernel/locking/semaphore.c:136

but task is already holding lock:
  (&obj_hash[i].lock){-.-.}, at: [<00000000ab5eb9f0>] __debug_check_no_obj_freed lib/debugobjects.c:736 [inline]
  (&obj_hash[i].lock){-.-.}, at: [<00000000ab5eb9f0>] debug_check_no_obj_freed+0x1e9/0xf1f lib/debugobjects.c:774

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (&obj_hash[i].lock){-.-.}:
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
        __debug_object_init+0x109/0x1040 lib/debugobjects.c:343
        debug_object_init+0x17/0x20 lib/debugobjects.c:391
        debug_hrtimer_init kernel/time/hrtimer.c:410 [inline]
        debug_init kernel/time/hrtimer.c:458 [inline]
        hrtimer_init+0x8c/0x410 kernel/time/hrtimer.c:1259
        init_dl_task_timer+0x1b/0x50 kernel/sched/deadline.c:1060
        __sched_fork+0x2bb/0xb60 kernel/sched/core.c:2189
        init_idle+0x75/0x820 kernel/sched/core.c:5352
        sched_init+0xb19/0xc43 kernel/sched/core.c:6049
        start_kernel+0x452/0x819 init/main.c:585
        x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:378
        x86_64_start_kernel+0x77/0x7a arch/x86/kernel/head64.c:359
        secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:237

-> #2 (&rq->lock){-.-.}:
        __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
        _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144
        rq_lock kernel/sched/sched.h:1760 [inline]
        task_fork_fair+0x7a/0x690 kernel/sched/fair.c:9471
        sched_fork+0x450/0xc10 kernel/sched/core.c:2405
        copy_process.part.37+0x1758/0x4b60 kernel/fork.c:1774
        copy_process kernel/fork.c:1617 [inline]
        _do_fork+0x1f7/0xf70 kernel/fork.c:2098
        kernel_thread+0x34/0x40 kernel/fork.c:2157
        rest_init+0x22/0xf0 init/main.c:402
        start_kernel+0x7f1/0x819 init/main.c:716
        x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:378
        x86_64_start_kernel+0x77/0x7a arch/x86/kernel/head64.c:359
        secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:237

-> #1 (&p->pi_lock){-.-.}:
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
        try_to_wake_up+0xbc/0x15f0 kernel/sched/core.c:1989
        wake_up_process+0x10/0x20 kernel/sched/core.c:2152
        __up.isra.0+0x1cc/0x2c0 kernel/locking/semaphore.c:262
        up+0x13b/0x1d0 kernel/locking/semaphore.c:187
        __up_console_sem+0xb2/0x1a0 kernel/printk/printk.c:242
        console_unlock+0x5af/0xfb0 kernel/printk/printk.c:2417
        vprintk_emit+0x5c3/0xb90 kernel/printk/printk.c:1907
        vprintk_default+0x28/0x30 kernel/printk/printk.c:1947
        vprintk_func+0x57/0xc0 kernel/printk/printk_safe.c:379
        printk+0xaa/0xca kernel/printk/printk.c:1980
        kauditd_printk_skb kernel/audit.c:506 [inline]
        kauditd_hold_skb+0x163/0x180 kernel/audit.c:539
        kauditd_send_queue+0xfa/0x140 kernel/audit.c:702
        kauditd_thread+0x660/0x940 kernel/audit.c:828
        kthread+0x33c/0x400 kernel/kthread.c:238
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:407

-> #0 ((console_sem).lock){..-.}:
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
        down_trylock+0x13/0x70 kernel/locking/semaphore.c:136
        __down_trylock_console_sem+0xa2/0x1e0 kernel/printk/printk.c:225
        console_trylock+0x15/0x70 kernel/printk/printk.c:2229
        console_trylock_spinning kernel/printk/printk.c:1643 [inline]
        vprintk_emit+0x5b5/0xb90 kernel/printk/printk.c:1906
        vprintk_default+0x28/0x30 kernel/printk/printk.c:1947
        vprintk_func+0x57/0xc0 kernel/printk/printk_safe.c:379
        printk+0xaa/0xca kernel/printk/printk.c:1980
        __warn_printk+0x90/0xf0 kernel/panic.c:599
        debug_print_object+0x166/0x220 lib/debugobjects.c:288
        __debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
        debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
        kfree+0xc7/0x260 mm/slab.c:3799
        process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:597
        process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
        worker_thread+0x223/0x1990 kernel/workqueue.c:2247
        kthread+0x33c/0x400 kernel/kthread.c:238
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:407

other info that might help us debug this:

Chain exists of:
   (console_sem).lock --> &rq->lock --> &obj_hash[i].lock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&obj_hash[i].lock);
                                lock(&rq->lock);
                                lock(&obj_hash[i].lock);
   lock((console_sem).lock);

  *** DEADLOCK ***

3 locks held by kworker/u4:1/21:
  #0:  ((wq_completion)"ib_addr"){+.+.}, at: [<00000000cabe6c6c>] process_one_work+0xaaf/0x1af0 kernel/workqueue.c:2084
  #1:  ((work_completion)(&(&req->work)->work)){+.+.}, at: [<000000001f3d791c>] process_one_work+0xb01/0x1af0 kernel/workqueue.c:2088
  #2:  (&obj_hash[i].lock){-.-.}, at: [<00000000ab5eb9f0>] __debug_check_no_obj_freed lib/debugobjects.c:736 [inline]
  #2:  (&obj_hash[i].lock){-.-.}, at: [<00000000ab5eb9f0>] debug_check_no_obj_freed+0x1e9/0xf1f lib/debugobjects.c:774

stack backtrace:
CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 4.16.0-rc2+ #324
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: ib_addr process_one_req
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x24d lib/dump_stack.c:53
  print_circular_bug.isra.38+0x2cd/0x2dc kernel/locking/lockdep.c:1223
  check_prev_add kernel/locking/lockdep.c:1863 [inline]
  check_prevs_add kernel/locking/lockdep.c:1976 [inline]
  validate_chain kernel/locking/lockdep.c:2417 [inline]
  __lock_acquire+0x30a8/0x3e00 kernel/locking/lockdep.c:3431
  lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
  __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
  _raw_spin_lock_irqsave+0x96/0xc0 kernel/locking/spinlock.c:152
  down_trylock+0x13/0x70 kernel/locking/semaphore.c:136
  __down_trylock_console_sem+0xa2/0x1e0 kernel/printk/printk.c:225
  console_trylock+0x15/0x70 kernel/printk/printk.c:2229
  console_trylock_spinning kernel/printk/printk.c:1643 [inline]
  vprintk_emit+0x5b5/0xb90 kernel/printk/printk.c:1906
  vprintk_default+0x28/0x30 kernel/printk/printk.c:1947
  vprintk_func+0x57/0xc0 kernel/printk/printk_safe.c:379
  printk+0xaa/0xca kernel/printk/printk.c:1980
  __warn_printk+0x90/0xf0 kernel/panic.c:599
  debug_print_object+0x166/0x220 lib/debugobjects.c:288
  __debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
  debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
  kfree+0xc7/0x260 mm/slab.c:3799
  process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:597
  process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
  kthread+0x33c/0x400 kernel/kthread.c:238
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:407
Shutting down cpus with NMI
Dumping ftrace buffer:
    (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..
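
The lockdep report above reduces to a cycle in a lock-order graph: the existing chain records which locks were taken while others were held, and the new attempt to take (console_sem).lock while holding &obj_hash[i].lock closes the loop. A minimal Python sketch of that check follows, assuming a plain directed graph and a depth-first search. The function name find_cycle and the edge list are illustrative, read off the chain printed in the report, and do not reflect lockdep's actual algorithm or data structures.

```python
def find_cycle(edges):
    """Return one cycle in a 'held -> acquired' lock graph, or None."""
    graph = {}
    for held, acquired in edges:
        graph.setdefault(held, []).append(acquired)
        graph.setdefault(acquired, [])

    visiting, done = [], set()

    def dfs(node):
        if node in visiting:
            # Back edge: slice out the loop and close it.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in graph[node]:
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


# Existing dependencies, in forward order (entries #1 to #3 of the report).
existing = [
    ("(console_sem).lock", "&p->pi_lock"),
    ("&p->pi_lock", "&rq->lock"),
    ("&rq->lock", "&obj_hash[i].lock"),
]
# The acquisition lockdep complained about (entry #0).
new_edge = ("&obj_hash[i].lock", "(console_sem).lock")

before = find_cycle(existing)
after = find_cycle(existing + [new_edge])
```

Without the new edge the graph is acyclic; adding it yields a loop that starts and ends at (console_sem).lock, which matches the "possible unsafe locking scenario" in the report.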


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkaller@...glegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug report.
Note: all commands must start at the beginning of a line in the email body.
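
The reply commands above share one shape: "#syz", a command name, and an optional argument, starting at the beginning of a line. As an illustration of that rule only, the Python sketch below pulls such commands out of an email body. The function name parse_syz_commands and the regular expression are assumptions made for this example, not syzbot's own parser.

```python
import re

# '#syz' must be the first thing on the line, per the note in the report.
SYZ_LINE = re.compile(r"^#syz[ \t]+([a-z-]+):?[ \t]*(.*)$", re.MULTILINE)


def parse_syz_commands(body):
    """Return (command, argument) pairs for lines that start with '#syz'."""
    return [(m.group(1), m.group(2).strip()) for m in SYZ_LINE.finditer(body)]


reply = """\
> #syz invalid
#syz fix: exact-commit-title
  #syz dup: indented, so ignored
#syz invalid
"""
commands = parse_syz_commands(reply)
```

Quoted or indented lines are skipped in this sketch because the pattern is anchored to the start of each line.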

View attachment "raw.log.txt" of type "text/plain" (28285 bytes)

View attachment "repro.syz.txt" of type "text/plain" (797 bytes)

View attachment "repro.c.txt" of type "text/plain" (26728 bytes)

View attachment "config.txt" of type "text/plain" (137429 bytes)