linux-kernel - Re: [PATCH] scsi: be2iscsi: Fix a theoretical leak in beiscsi_create

Message-ID: <54f36c62-10bf-8736-39ce-27ece097d9de@proxmox.com>
Date:   Thu, 3 Dec 2020 11:10:09 +0100
From:   Thomas Lamprecht <t.lamprecht@...xmox.com>
To:     dan.carpenter@...cle.com
Cc:     James.Bottomley@...e.de,
        jayamohank@...edirect-LB5-1afb6e2973825a56.elb.us-east-1.amazonaws.com,
        jejb@...ux.ibm.com, jitendra.bhivare@...adcom.com,
        kernel-janitors@...r.kernel.org, ketan.mukadam@...adcom.com,
        linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org,
        martin.petersen@...cle.com, subbu.seetharaman@...adcom.com,
        stable@...r.kernel.org
Subject: Re: [PATCH] scsi: be2iscsi: Fix a theoretical leak in
 beiscsi_create_eqs()

> The be_fill_queue() function can only fail when "eq_vaddress" is NULL
> and since it's non-NULL here that means the function call can't fail.
> But imagine if it could, then in that situation we would want to store
> the "paddr" so that dma memory can be released.
> 
> Fixes: bfead3b2cb46 ("[SCSI] be2iscsi: Adding msix and mcc_rings V3")
> Signed-off-by: Dan Carpenter <dan.carpenter@...cle.com>
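
For readers following along, the general pattern the description refers to
(record the allocation in the owning structure before a call that might fail,
so the error path can release it) can be sketched in userspace as below. This
is only an illustration under assumptions: demo_mem, demo_queue, demo_alloc,
demo_free, demo_fill_queue and demo_create_queue are made-up names, malloc
stands in for the DMA allocator, and none of this is the be2iscsi code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's structures; not the real types. */
struct demo_mem {
	void *va;           /* CPU address of the buffer */
	unsigned long dma;  /* stand-in for the DMA (bus) address */
};

struct demo_queue {
	struct demo_mem mem;
};

static int live_allocs;     /* buffers allocated but not yet freed */

/* Stand-in for a coherent DMA allocation (assumption: calloc is enough here). */
static void *demo_alloc(size_t size, unsigned long *dma)
{
	void *va = calloc(1, size);

	if (va) {
		*dma = (unsigned long)va;   /* fake bus address */
		live_allocs++;
	}
	return va;
}

/* Release the buffer recorded in mem, if any. */
static void demo_free(struct demo_mem *mem)
{
	if (mem->va) {
		free(mem->va);
		mem->va = NULL;
		mem->dma = 0;
		live_allocs--;
	}
}

/* Stand-in for a fill step: fails when va is NULL or when forced to. */
static int demo_fill_queue(struct demo_queue *q, void *va, int force_fail)
{
	if (!va || force_fail)
		return -1;
	q->mem.va = va;
	return 0;
}

static int demo_create_queue(struct demo_queue *q, int force_fail)
{
	unsigned long dma;
	void *va = demo_alloc(4096, &dma);

	if (!va)
		return -1;

	/* Record the allocation first so the error path below can find it. */
	q->mem.va = va;
	q->mem.dma = dma;

	if (demo_fill_queue(q, va, force_fail)) {
		demo_free(&q->mem);     /* nothing leaks on failure */
		return -1;
	}
	return 0;
}
```

Calling demo_create_queue() with force_fail set leaves live_allocs at zero,
which is the property the description is after. The sketch says nothing about
why the real change misbehaves on the hardware mentioned below.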

This reached us through the stable 5.4 tree with v5.4.74, and some of our
users report that it results in kernel oopses and delayed boot on their
HP DL 380 Gen 9 (and other Gen 9, FWICT) servers:

> systemd-udevd   D    0   501      1 0x80000000
> Call Trace:
>  __schedule+0x2e6/0x6f0
>  schedule+0x33/0xa0
>  schedule_timeout+0x205/0x330
>  wait_for_completion+0xb7/0x140
>  ? wake_up_q+0x80/0x80
>  __flush_work+0x131/0x1e0
>  ? worker_detach_from_pool+0xb0/0xb0
>  work_on_cpu+0x6d/0x90
>  ? workqueue_congested+0x80/0x80
>  ? pci_device_shutdown+0x60/0x60
>  pci_device_probe+0x190/0x1b0
>  really_probe+0x1c8/0x3e0
>  driver_probe_device+0xbb/0x100
>  device_driver_attach+0x58/0x60
>  __driver_attach+0x8f/0x150
>  ? device_driver_attach+0x60/0x60
>  bus_for_each_dev+0x79/0xc0
>  ? kmem_cache_alloc_trace+0x1a0/0x230
>  driver_attach+0x1e/0x20
>  bus_add_driver+0x154/0x1f0
>  ? 0xffffffffc0453000
>  driver_register+0x70/0xc0
>  ? 0xffffffffc0453000
>  __pci_register_driver+0x57/0x60
>  beiscsi_module_init+0x62/0x1000 [be2iscsi]
>  do_one_initcall+0x4a/0x1fa
>  ? _cond_resched+0x19/0x30
>  ? kmem_cache_alloc_trace+0x1a0/0x230
>  do_init_module+0x60/0x230
>  load_module+0x231b/0x2590
>  __do_sys_finit_module+0xbd/0x120
>  ? __do_sys_finit_module+0xbd/0x120
>  __x64_sys_finit_module+0x1a/0x20
>  do_syscall_64+0x57/0x190
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f00aca06f59
> Code: Bad RIP value.
> RSP: 002b:00007ffc14380858 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 0000558c726262e0 RCX: 00007f00aca06f59
> RDX: 0000000000000000 RSI: 00007f00ac90bcad RDI: 000000000000000e
> RBP: 00007f00ac90bcad R08: 0000000000000000 R09: 0000000000000000
> R10: 000000000000000e R11: 0000000000000246 R12: 0000000000000000
> R13: 0000558c725f6030 R14: 0000000000020000 R15: 0000558c726262e0

Blacklisting the be2iscsi module or reverting this commit helps. I did not get
around to looking further into the mechanics at play and figured you would be
faster at that, or that this info would at least help someone else searching
for the same symptoms.

cheers,
Thomas