lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 22 Jul 2021 14:49:04 +0800
From:   Shung-Hsi Yu <shung-hsi.yu@...e.com>
To:     Majd Dibbiny <majd@...lanox.com>,
        Maor Gottlieb <maorg@...lanox.com>,
        Saeed Mahameed <saeedm@...dia.com>, netdev@...r.kernel.org,
        linux-rdma@...r.kernel.org, linux-kernel@...r.kernel.org
Cc:     Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Leon Romanovsky <leon@...nel.org>
Subject: BUG: double free of mlx5_cmd_work_ent during shutdown

Hi,

When force teardown is used (for devices that doesn't support fast
teardown), mlx5_cmd_force_teardown_hca() sends out a TEARDOWN_HCA command,
wait for the command to complete, and call cmd_ent_put() from the
following stack.

    cmd_exec() {
        // ...
        out_free:
            cmd_ent_put(ent); // HERE
        out:
            return err;
    }
      mlx5_cmd_exec_polling()
        mlx5_cmd_force_teardown_hca()
          shutdown()

However, there's another path which cmd_ent_put() is called for the same
command from the async events chain.

    mlx5_cmd_comp_handler() {
        for (i = 0; i < (1 << cmd->log_sz); i++) {
            if (test_bit(i, &vector)) {
                // ...
                if (!forced || /* Real FW completion */
                    pci_channel_offline(dev->pdev) || /* FW is inaccessible */
                    dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
                    cmd_ent_put(ent); // HERE
        // ...
    }
      cmd_comp_notifier()
        notifier_call_chain()
          atomic_notifier_call_chain()
            mlx5_eq_async_int()
              notifier_call_chain()
                atomic_notifier_call_chain()
                  mlx5_irq_int_handler()

This leads to to "refcount_t: underflow; use-after-free" warning during
shutdown.

---

Here's the stack trace on v5.13-based kernel with KASAN enabled.
(Line number doesn't exactly match though)

    systemd-shutdown[1]: Rebooting with kexec.
    kvm: exiting hardware virtualization
    sd 0:0:0:0: [sda] Synchronizing SCSI cache
    mlx5_core 0000:0a:00.1: Shutdown was called
    mlx5_try_fast_unload:1629: mlx5_core 0000:0a:00.1: mlx5_try_fast_unload:1629:(pid 4959): force teardown firmware support=1
    mlx5_try_fast_unload:1630: mlx5_core 0000:0a:00.1: mlx5_try_fast_unload:1630:(pid 4959): fast teardown firmware support=0
    mlx5_cmd_fast_teardown_hca:330: mlx5_core 0000:0a:00.1: mlx5_cmd_fast_teardown_hca:330:(pid 4959): fast teardown is not supported in the firmware
    dump_command:805: mlx5_core 0000:0a:00.1: dump_command:805:(pid 4647): cmd[0]: start dump
    dump_command:814: mlx5_core 0000:0a:00.1: dump_command:814:(pid 4647): cmd[0]: dump command TEARDOWN_HCA(0x103) INPUT
    dump_buf:272: cmd[0]: 000: 07000000 00000010 00000000 00000000
    dump_buf:272: cmd[0]: 010: 01030000 00000000 00000001 00000000
    dump_buf:272: cmd[0]: 020: 00000000 00000000 00000000 00000000
    dump_buf:272: cmd[0]: 030: 00000000 00000000 00000010 3fc50001
    dump_buf:279: 
    dump_command:848: mlx5_core 0000:0a:00.1: dump_command:848:(pid 4647): cmd[0]: end dump
    cmd_work_handler:1003: mlx5_core 0000:0a:00.1: cmd_work_handler:1003:(pid 4647): writing 0x1 to command doorbell
    dump_command:805: mlx5_core 0000:0a:00.1: dump_command:805:(pid 4647): cmd[0]: start dump
    mlx5_core 0000:0a:00.1: mlx5_cmd_comp_handler:1613:(pid 0): Command completion arrived after timeout (entry idx = 0).
    dump_command:814: mlx5_core 0000:0a:00.1: dump_command:814:(pid 4647): cmd[0]: dump command TEARDOWN_HCA(0x103) OUTPUT

    cmd_ent_put(00000000fbaea4c6) will be called in mlx5_cmd_comp_handler:1615 for TEARDOWN_HCA(0x103), polling 1, ret 0
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    any_notifier:148: mlx5_core 0000:0a:00.1: any_notifier:148:(pid 0): Async eqe type MLX5_EVENT_TYPE_CMD, subtype (0)
    ==================================================================
    BUG: KASAN: use-after-free in dump_command+0x9d9/0xa40 [mlx5_core]
    Read of size 4 at addr ffff8881120620e0 by task kworker/u96:0/4647
    
    CPU: 10 PID: 4647 Comm: kworker/u96:0 Tainted: G            E     5.14.0-rc1-1-default+ #7 6ab45f24446e456129f9451daed46c2a2aa80406
    Hardware name: Dell Inc. PowerEdge T320/0FDT3J, BIOS 2.9.0 01/08/2020
    Workqueue: mlx5_cmd_0000:0a:00.1 cmd_work_handler [mlx5_core]
    Call Trace:
     dump_stack_lvl+0x57/0x72
     print_address_description.constprop.0+0x1f/0x140
     ? dump_command+0x9d9/0xa40 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     kasan_report.cold+0x7f/0x11b
     ? dump_command+0x9d9/0xa40 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     dump_command+0x9d9/0xa40 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? seqcount_lockdep_reader_access.constprop.0+0x84/0x90
     ? seqcount_lockdep_reader_access.constprop.0+0x84/0x90
     mlx5_cmd_comp_handler+0x498/0x13e0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? lock_release+0x1ea/0x6c0
     ? mlx5_cmd_check+0x2f0/0x2f0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? seqcount_lockdep_reader_access.constprop.0+0x84/0x90
     ? cmd_work_handler+0xfa2/0x19a0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? __cond_resched+0x15/0x30
     cmd_work_handler+0xfa2/0x19a0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? cmd_comp_notifier+0x90/0x90 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? lock_is_held_type+0xe0/0x110
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     process_one_work+0x8b2/0x14c0
     ? lock_release+0x6c0/0x6c0
     ? pwq_dec_nr_in_flight+0x260/0x260
     ? rwlock_bug.part.0+0x90/0x90
     worker_thread+0x544/0x12d0
     ? get_cpu_entry_area+0x13/0x20
     ? __kthread_parkme+0xa5/0x130
     ? process_one_work+0x14c0/0x14c0
     kthread+0x349/0x410
     ? _raw_spin_unlock_irq+0x24/0x40
     ? set_kthread_struct+0x100/0x100
     ret_from_fork+0x22/0x30
    
    Allocated by task 4959:
     kasan_save_stack+0x1b/0x40
     __kasan_kmalloc+0x7c/0x90
     cmd_exec+0x6f0/0x1740 [mlx5_core]
     mlx5_cmd_exec_polling+0x1f/0x40 [mlx5_core]
     mlx5_cmd_force_teardown_hca+0xfc/0x200 [mlx5_core]
     shutdown+0x2e7/0x3b3 [mlx5_core]
     pci_device_shutdown+0x75/0x110
     device_shutdown+0x2ce/0x5e0
     kernel_kexec+0x57/0xb0
     __do_sys_reboot+0x284/0x2f0
     do_syscall_64+0x5c/0xc0
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    Freed by task 0:
     kasan_save_stack+0x1b/0x40
     kasan_set_track+0x1c/0x30
     kasan_set_free_info+0x20/0x30
     __kasan_slab_free+0xec/0x120
     slab_free_freelist_hook+0x90/0x1b0
     kfree+0xd1/0x410
     mlx5_cmd_comp_handler.cold+0x138/0x16c [mlx5_core]
     cmd_comp_notifier+0x72/0x90 [mlx5_core]
     atomic_notifier_call_chain+0xdd/0x1c0
     mlx5_eq_async_int+0x34c/0x8f0 [mlx5_core]
     atomic_notifier_call_chain+0xdd/0x1c0
     irq_int_handler+0x11/0x20 [mlx5_core]
     __handle_irq_event_percpu+0x264/0x640
     handle_irq_event+0x10b/0x270
     handle_edge_irq+0x21a/0xbe0
     __common_interrupt+0x72/0x160
     common_interrupt+0x7b/0xa0
     asm_common_interrupt+0x1e/0x40
    
    Last potentially related work creation:
     kasan_save_stack+0x1b/0x40
     kasan_record_aux_stack+0xa5/0xb0
     insert_work+0x44/0x280
     __queue_work+0x4b7/0xc50
     queue_work_on+0x8d/0x90
     cmd_exec+0xfac/0x1740 [mlx5_core]
     mlx5_cmd_exec_polling+0x1f/0x40 [mlx5_core]
     mlx5_cmd_force_teardown_hca+0xfc/0x200 [mlx5_core]
     shutdown+0x2e7/0x3b3 [mlx5_core]
     pci_device_shutdown+0x75/0x110
     device_shutdown+0x2ce/0x5e0
     kernel_kexec+0x57/0xb0
     __do_sys_reboot+0x284/0x2f0
     do_syscall_64+0x5c/0xc0
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    The buggy address belongs to the object at ffff888112062000
     which belongs to the cache kmalloc-1k of size 1024
    The buggy address is located 224 bytes inside of
     1024-byte region [ffff888112062000, ffff888112062400)
    The buggy address belongs to the page:
    page:000000008b314a5c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x112060
    head:000000008b314a5c order:3 compound_mapcount:0 compound_pincount:0
    flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
    raw: 002ffff800010200 ffffea00041e3a00 0000000500000005 ffff888100042dc0
    raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffff888112061f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff888112062000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff888112062080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                           ^
     ffff888112062100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     ffff888112062180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ==================================================================
    Disabling lock debugging due to kernel taint
    dump_buf:272: cmd[0]: 000: 07000000 00000010 00000000 00000000
    dump_buf:272: cmd[0]: 010: 01030000 00000000 00000001 00000000
    dump_buf:272: cmd[0]: 020: 00000000 00000000 00000000 00000001
    dump_buf:272: cmd[0]: 030: 00000000 00000000 00000010 3fc50000
    dump_buf:279: 
    dump_command:848: mlx5_core 0000:0a:00.1: dump_command:848:(pid 4647): cmd[0]: end dump
    mlx5_cmd_comp_handler:1648: mlx5_core 0000:0a:00.1: mlx5_cmd_comp_handler:1648:(pid 4647): command completed. ret 0x0, delivery status no errors(0x0)
    wait_func:1102: mlx5_core 0000:0a:00.1: wait_func:1102:(pid 4959): err 0, delivery status no errors(0)

    cmd_ent_put(00000000fbaea4c6) will be called in mlx5_cmd_invoke:1178 for TEARDOWN_HCA(0x103), polling 1, ret 0
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    ------------[ cut here ]------------
    refcount_t: underflow; use-after-free.
    WARNING: CPU: 2 PID: 4959 at lib/refcount.c:28 refcount_warn_saturate+0xaa/0x140
    Modules linked in: af_packet(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) mlx5_core(E) intel_rapl_msr(E) intel_rapl_common(E) sb_edac(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) mgag200(E) coretemp(E) kvm_intel(E) drm_kms_helper(E) kvm(E) syscopyarea(E) sysfillrect(E) ipmi_ssif(E) sysimgblt(E) joydev(E) fb_sys_fops(E) dcdbas(E) cec(E) tg3(E) nls_iso8859_1(E) iTCO_wdt(E) nls_cp437(E) irqbypass(E) rc_core(E) libphy(E) intel_pmc_bxt(E) mei_me(E) ipmi_si(E) vfat(E) i2c_algo_bit(E) mlxfw(E) efi_pstore(E) fat(E) ipmi_devintf(E) iTCO_vendor_support(E) pcspkr(E) pci_hyperv_intf(E) ipmi_msghandler(E) psample(E) tiny_power_button(E) mei(E) lpc_ich(E) wmi(E) button(E) drm(E) fuse(E) configfs(E) hid_microsoft(E) ff_memless(E) hid_generic(E) usbhid(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) aesni_intel(E) crypto_simd(E) cryptd(E) ehci_pci(E) ehci_hcd(E) usbcore(E) sg(E) efivarfs(E)
    CPU: 2 PID: 4959 Comm: kexec Tainted: G    B       E     5.14.0-rc1-1-default+ #7 6ab45f24446e456129f9451daed46c2a2aa80406
    Hardware name: Dell Inc. PowerEdge T320/0FDT3J, BIOS 2.9.0 01/08/2020
    RIP: 0010:refcount_warn_saturate+0xaa/0x140
    Code: e9 0c b5 02 01 e8 be 45 f9 00 0f 0b eb d5 80 3d d7 0c b5 02 00 75 cc 48 c7 c7 e0 f0 a4 ad c6 05 c7 0c b5 02 01 e8 9e 45 f9 00 <0f> 0b eb b5 80 3d b5 0c b5 02 00 75 ac 48 c7 c7 a0 f1 a4 ad c6 05
    RSP: 0018:ffff8881593b79d8 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff888112062214 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffed102b276f2d
    RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
    R10: ffffed102b276eca R11: 0000000000000001 R12: 0000000000000000
    R13: 0000000000000000 R14: ffff888159585908 R15: ffff8881079fa080
    FS:  00007f8846e63740(0000) GS:ffff8881d3700000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f884700b670 CR3: 0000000112ef0004 CR4: 00000000001706e0
    Call Trace:
     cmd_exec+0xda6/0x1740 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? cb_timeout_handler+0xa0/0xa0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? mark_held_locks+0x9e/0xe0
     mlx5_cmd_exec_polling+0x1f/0x40 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     mlx5_cmd_force_teardown_hca+0xfc/0x200 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     ? mlx5_cmd_teardown_hca+0xd0/0xd0 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     shutdown+0x2e7/0x3b3 [mlx5_core 159b773d3cc25877247f038573d06bc052f8250f]
     pci_device_shutdown+0x75/0x110
     device_shutdown+0x2ce/0x5e0
     kernel_kexec+0x57/0xb0
     __do_sys_reboot+0x284/0x2f0
     ? poweroff_work_func+0x90/0x90
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? kasan_quarantine_put+0x8e/0x1e0
     ? lockdep_hardirqs_on+0x77/0xf0
     ? kasan_quarantine_put+0x8e/0x1e0
     ? lock_is_held_type+0xe0/0x110
     ? slab_free_freelist_hook+0x90/0x1b0
     ? __fput+0x2a1/0x780
     ? lock_is_held_type+0x53/0x110
     ? rcu_read_lock_sched_held+0x3f/0x80
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? call_rcu+0x5b2/0xa80
     ? lock_is_held_type+0x53/0x110
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? syscall_enter_from_user_mode+0x21/0x70
     do_syscall_64+0x5c/0xc0
     ? lockdep_hardirqs_on_prepare+0x278/0x3e0
     ? lockdep_hardirqs_on+0x77/0xf0
     ? do_syscall_64+0x69/0xc0
     ? do_syscall_64+0x69/0xc0
     ? lock_is_held_type+0x53/0x110
     ? irqentry_exit_to_user_mode+0xa/0x40
     ? asm_exc_page_fault+0x8/0x30
     ? lockdep_hardirqs_on+0x77/0xf0
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x7f8846faa0d7
    Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 69 ad 0c 00 f7 d8 64 89 02 b8
    RSP: 002b:00007fffeb69d648 EFLAGS: 00000206 ORIG_RAX: 00000000000000a9
    RAX: ffffffffffffffda RBX: 0000561e00f46520 RCX: 00007f8846faa0d7
    RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead
    RBP: 00007f8847076600 R08: 000000000000000b R09: 0000000000000000
    R10: 00007f8846ebe0d8 R11: 0000000000000206 R12: 0000561dffc738fe
    R13: 0000561dffc73905 R14: 00007fffeb69d7a0 R15: 00000000ffffffff
    irq event stamp: 11228
    hardirqs last  enabled at (11227): [<ffffffffad2dde14>] _raw_spin_unlock_irq+0x24/0x40
    hardirqs last disabled at (11228): [<ffffffffad2c9f39>] __schedule+0x1229/0x2300
    softirqs last  enabled at (11162): [<ffffffffab3bd94e>] __irq_exit_rcu+0x12e/0x170
    softirqs last disabled at (11157): [<ffffffffab3bd94e>] __irq_exit_rcu+0x12e/0x170
    ---[ end trace 151c2ed42dd09eaf ]---
    cmd_exec:1864: mlx5_core 0000:0a:00.1: cmd_exec:1864:(pid 4959): err 0, status 0
    mlx5_core 0000:0a:00.1: mlx5_cmd_force_teardown_hca:313:(pid 4959): teardown with force mode failed, doing normal teardown
    mlx5_try_fast_unload:1654: mlx5_core 0000:0a:00.1: mlx5_try_fast_unload:1654:(pid 4959): Firmware couldn't do fast unload error: -5
    mlx5_wait_for_pages:739: mlx5_core 0000:0a:00.1: mlx5_wait_for_pages:739:(pid 4959): Waiting for 0 pages
    mlx5_wait_for_pages:752: mlx5_core 0000:0a:00.1: mlx5_wait_for_pages:752:(pid 4959): All pages received
    dump_command:805: mlx5_core 0000:0a:00.1: dump_command:805:(pid 4647): cmd[0]: start dump
    dump_command:814: mlx5_core 0000:0a:00.1: dump_command:814:(pid 4647): cmd[0]: dump command DESTROY_FLOW_TABLE(0x931) INPUT
    dump_buf:272: cmd[0]: 000: 07000000 00000040 00000001 347e4400
    dump_buf:272: cmd[0]: 010: 09310000 00000000 00000000 00000000
    dump_buf:272: cmd[0]: 020: 00000000 00000000 00000000 00000000
    dump_buf:272: cmd[0]: 030: 00000000 00000000 00000010 40de0001


Best,
Shung-Hsi Yu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ