Date:   Sat, 7 Jul 2018 03:41:30 +0200
From:   Tomas Bortoli <tomasbortoli@...il.com>
To:     dledford@...hat.com, jgg@...pe.ca
Cc:     leon@...nel.org, parav@...lanox.com, roland@...estorage.com,
        swise@...ngridcomputing.com, linux-rdma@...r.kernel.org,
        linux-kernel@...r.kernel.org, syzkaller@...glegroups.com
Subject: [PATCH] KASAN: use-after-free Read in rdma_listen

Hi,

I spent some time debugging the issue in the subject, found by syzkaller:

https://syzkaller.appspot.com/bug?id=b8febdb3c7c8c1f1b606fb903cee66b21b2fd02f

I traced the UAF back to the fact that cma_listen_on_all() adds
"id_priv->list" to the global variable "listen_any_list", but that
element is then never removed in rdma_destroy_id(). (The call to
cma_release_dev() in rdma_destroy_id() looks like it should do the
removal, but it does not get executed.)

Therefore, if a program allocates a "struct rdma_cm_id" (through
ucma_open + ucma_create_id), then executes cma_listen_on_all(), then
frees the struct and repeats, the second execution of
cma_listen_on_all() makes the kernel update the link pointers of the
freed node, triggering the UAF. I was able to fix the UAF with this ugly
patch:

--- b/drivers/infiniband/core/cma.c    2018-07-07 02:28:03.214589868 +0200
+++ a/drivers/infiniband/core/cma.c    2018-07-07 03:35:44.325301216 +0200
@@ -1678,6 +1678,11 @@ void rdma_destroy_id(struct rdma_cm_id *
     mutex_lock(&id_priv->handler_mutex);
     mutex_unlock(&id_priv->handler_mutex);
 
+    mutex_lock(&lock);
+    if (id_priv->list.next != NULL && id_priv->list.prev != NULL)
+        list_del(&id_priv->list);
+    mutex_unlock(&lock);
+
     if (id_priv->cma_dev) {
         rdma_restrack_del(&id_priv->res);
         if (rdma_cap_ib_cm(id_priv->id.device, 1)) {
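
For readers unfamiliar with the list handling, here is a minimal
userspace sketch of the idea behind the patch: unlink the node from the
global list before freeing the object that embeds it. It is not the
cma.c code. "struct list_node", "struct id_priv", "id_create",
"id_listen_on_all", "id_destroy" and "list_len" are hypothetical
stand-ins, and the locking that the patch does with "lock" is left out
for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Stand-in for the global listen_any_list; an empty list points at itself. */
static struct list_node listen_any = { &listen_any, &listen_any };

/* Heavily reduced, hypothetical stand-in for the private id structure. */
struct id_priv {
    struct list_node list;
    int listening;
};

static void node_add_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;   /* writes into the current tail node */
    head->prev = n;
}

static void node_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = NULL;
}

static struct id_priv *id_create(void)
{
    return calloc(1, sizeof(struct id_priv));
}

static void id_listen_on_all(struct id_priv *id)
{
    /* If the current tail had been freed while still linked, the write
     * in node_add_tail() would land in freed memory: that is the UAF. */
    node_add_tail(&id->list, &listen_any);
    id->listening = 1;
}

static void id_destroy(struct id_priv *id)
{
    /* Unlink first, so no list neighbour keeps a pointer to freed memory. */
    if (id->listening)
        node_del(&id->list);
    free(id);
}

/* Number of nodes currently linked into listen_any. */
static int list_len(void)
{
    int n = 0;
    for (struct list_node *p = listen_any.next; p != &listen_any; p = p->next)
        n++;
    return n;
}
```

With id_destroy() unlinking the node, a create/listen/destroy cycle can
be repeated and the global list is empty again after each destroy.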

Note: I only tested this patch against the shortest reproducer for this
issue (not any other use of rdma_cm):

https://syzkaller.appspot.com/text?tag=ReproC&x=1334f10f800000

I had to add that "if" to the patch because, after several iterations,
running the reproducer provoked a NULL dereference in the added
list_del() call: for some reason I have not figured out yet, the next
and prev pointers of the list node in question are sometimes zeroed
(by what?).
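
One possible explanation, stated here only as an assumption and not
verified against cma.c, is that a node which was zero-initialized and
never added to any list simply has NULL next/prev pointers. The
userspace sketch below shows that case and a common alternative to
NULL checks: initialize the node to point at itself and unlink with a
helper that re-initializes it. This mirrors what INIT_LIST_HEAD() and
list_del_init() do in the kernel's list.h. The names "list_node",
"node_init", "node_add", "node_del_init" and "node_has_pointers" are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Point the node at itself: an "empty" state that is safe to unlink. */
static void node_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

/* Insert n right after head. */
static void node_add(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n and re-initialize it, so calling this twice is harmless. */
static void node_del_init(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}

/* The check from the patch: true only if both link pointers are non-NULL. */
static int node_has_pointers(const struct list_node *n)
{
    return n->next != NULL && n->prev != NULL;
}
```

A node that starts out as { NULL, NULL } fails node_has_pointers(), and
unlinking it would dereference NULL. A node set up with node_init() can
be passed to node_del_init() whether or not it was ever added.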


Moreover, I noticed that running the reproducer for a long time exhausts
all the available memory. To spot the memory leaks I recompiled with:

CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=10000
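
With such a kernel, kmemleak is driven through its debugfs control file,
normally /sys/kernel/debug/kmemleak: writing "scan" triggers a scan and
reading the file returns the report. Below is a small sketch of doing
that from C. The helper names "kmemleak_command" and "kmemleak_dump"
are hypothetical, and the path is a parameter because the debugfs mount
point may differ on a given system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a command such as "scan" or "clear" to the kmemleak control
 * file at path. Returns 0 on success, -1 on failure. */
static int kmemleak_command(const char *path, const char *cmd)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fputs(cmd, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Copy the report at path to out. Returns the number of complete lines
 * copied, or -1 if the file cannot be opened. */
static long kmemleak_dump(const char *path, FILE *out)
{
    char line[512];
    long lines = 0;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        fputs(line, out);
        if (strchr(line, '\n'))
            lines++;
    }
    fclose(f);
    return lines;
}
```

Typical use, as root and with debugfs mounted, would be
kmemleak_command("/sys/kernel/debug/kmemleak", "scan") followed by
kmemleak_dump("/sys/kernel/debug/kmemleak", stdout).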

The reproducer apparently induces two memory leaks, as reported by kmemleak:

unreferenced object 0xffff880069f49d40 (size 512):
  comm "repro", pid 4263, jiffies 4294722196 (age 688.262s)
  hex dump (first 32 bytes):
    00 b8 13 5a 00 88 ff ff 40 9d f4 69 00 88 ff ff  ...Z....@.......
    0a 00 98 a6 00 00 00 00 fe 80 00 00 00 00 00 00  ................
  backtrace:
    [<0000000075a2f334>] kmem_cache_alloc_trace+0x1b2/0x3d0
    [<0000000075fd9fea>] rdma_resolve_ip+0xc0/0x6b0
    [<0000000033592b0b>] rdma_resolve_addr+0x490/0x2580
    [<00000000d6f2cd9d>] ucma_resolve_ip+0x193/0x260
    [<0000000068f1c2b7>] ucma_write+0x2ec/0x3f0
    [<00000000015692cc>] __vfs_write+0x107/0x920
    [<000000009528b010>] vfs_write+0x189/0x510
    [<000000001a5d169b>] ksys_write+0xfa/0x240
    [<00000000b747746a>] __x64_sys_write+0x73/0xb0
    [<0000000071590ffb>] do_syscall_64+0x18c/0x760
    [<000000003c31113f>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<0000000059247e9d>] 0xffffffffffffffff


unreferenced object 0xffff88006c0c0bc0 (size 576):
  comm "repro", pid 4261, jiffies 4294722191 (age 688.261s)
  hex dump (first 32 bytes):
    00 02 00 00 00 00 00 00 80 b8 07 6c 00 88 ff ff  ...........l....
    b0 7d 2c 6b 00 88 ff ff d8 0b 0c 6c 00 88 ff ff  .},k.......l....
  backtrace:
    [<0000000039511ef2>] kmem_cache_alloc+0x1b2/0x3d0
    [<00000000106bf668>] radix_tree_node_alloc.constprop.18+0x5e/0x2e0
    [<000000005b2f026d>] idr_get_free+0x9f5/0x1000
    [<00000000445baa5a>] idr_alloc_u32+0x1bc/0x3d0
    [<000000007fd1b6f4>] idr_alloc+0xfd/0x190
    [<00000000d706389e>] cma_alloc_port+0xb0/0x170
    [<000000008f968f9e>] rdma_bind_addr+0x1252/0x1f00
    [<00000000e3361215>] rdma_resolve_addr+0x39e/0x2580
    [<00000000d6f2cd9d>] ucma_resolve_ip+0x193/0x260
    [<0000000068f1c2b7>] ucma_write+0x2ec/0x3f0
    [<00000000015692cc>] __vfs_write+0x107/0x920
    [<000000009528b010>] vfs_write+0x189/0x510
    [<000000001a5d169b>] ksys_write+0xfa/0x240
    [<00000000b747746a>] __x64_sys_write+0x73/0xb0
    [<0000000071590ffb>] do_syscall_64+0x18c/0x760
    [<000000003c31113f>] entry_SYSCALL_64_after_hwframe+0x49/0xbe

I don't have a background in the usage or internals of the driver in
question, but I hope these clues will help in finding a proper fix.

Tomas

