[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZJKZT3LHBN3zEUd1@shredder>
Date: Wed, 21 Jun 2023 09:31:43 +0300
From: Ido Schimmel <idosch@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
davem@...emloft.net, pabeni@...hat.com, edumazet@...gle.com,
petrm@...dia.com
Subject: Re: [RFC PATCH net-next 1/2] devlink: Hold a reference on parent
device
On Tue, Jun 20, 2023 at 10:43:51AM -0700, Jakub Kicinski wrote:
> Do we need to hold the reference on the device until release?
> I think you can release it in devlink_free().
> The only valid fields for an unregistered devlink instance are:
> - lock
> - refcount
> - index
>
> And obviously unregistered devices can't be reloaded.
Thanks for taking a look.
Moving the release to devlink_free() [1] was the first thing I tried and
it indeed solves the problem I mentioned earlier, but creates a new one.
After devlink_free() returns the devlink instance can still be accessed
by user space in devlink_get_from_attrs_lock(). If I reload in a loop
while concurrently removing and adding the device [2], we can hit a UAF
when trying to acquire the device lock [3].
[1]
diff --git a/net/devlink/core.c b/net/devlink/core.c
index a4b6d548e50c..b60a8463d6e0 100644
--- a/net/devlink/core.c
+++ b/net/devlink/core.c
@@ -92,7 +92,6 @@ static void devlink_release(struct work_struct *work)
mutex_destroy(&devlink->lock);
lockdep_unregister_key(&devlink->lock_key);
- put_device(devlink->dev);
kfree(devlink);
}
@@ -264,6 +263,8 @@ void devlink_free(struct devlink *devlink)
xa_erase(&devlinks, devlink->index);
+ put_device(devlink->dev);
+
devlink_put(devlink);
}
EXPORT_SYMBOL_GPL(devlink_free);
[2]
while true; do devlink dev reload netdevsim/netdevsim10 &> /dev/null; done &
while true; do echo "10 0" > /sys/bus/netdevsim/new_device; echo 10 > /sys/bus/netdevsim/del_device; done
[3]
[ 96.081096] ==================================================================
[ 96.082158] BUG: KASAN: slab-use-after-free in __mutex_lock+0x18a7/0x1ac0
[ 96.083107] Read of size 8 at addr ffff888109caa8d8 by task devlink/429
[ 96.084266] CPU: 0 PID: 429 Comm: devlink Not tainted 6.4.0-rc6-custom-gb01b0912311c-dirty #303
[ 96.085443] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
[ 96.086632] Call Trace:
[ 96.086990] <TASK>
[ 96.087314] dump_stack_lvl+0x91/0xf0
[ 96.087852] print_report+0xcf/0x660
[ 96.088384] ? __virt_addr_valid+0x86/0x360
[ 96.088980] ? __mutex_lock+0x18a7/0x1ac0
[ 96.089558] ? __mutex_lock+0x18a7/0x1ac0
[ 96.090132] kasan_report+0xd6/0x110
[ 96.090667] ? __mutex_lock+0x18a7/0x1ac0
[ 96.091245] __mutex_lock+0x18a7/0x1ac0
[ 96.091898] ? devlink_get_from_attrs_lock+0x2bc/0x460
[ 96.092645] ? mutex_lock_io_nested+0x18b0/0x18b0
[ 96.093323] ? reacquire_held_locks+0x4e0/0x4e0
[ 96.093972] ? devlink_try_get+0x158/0x1e0
[ 96.094586] ? devlink_get_from_attrs_lock+0x460/0x460
[ 96.095325] ? devlink_get_from_attrs_lock+0x2bc/0x460
[ 96.096053] devlink_get_from_attrs_lock+0x2bc/0x460
[ 96.096767] ? devlink_nl_post_doit+0xf0/0xf0
[ 96.097409] ? __nla_parse+0x40/0x50
[ 96.097943] ? devlink_get_from_attrs_lock+0x460/0x460
[ 96.098671] devlink_nl_pre_doit+0xb3/0x480
[ 96.099255] genl_family_rcv_msg_doit.isra.0+0x1b8/0x2e0
[ 96.099966] ? genl_start+0x670/0x670
[ 96.100487] ? ns_capable+0xda/0x110
[ 96.100991] genl_rcv_msg+0x558/0x7f0
[ 96.101516] ? genl_family_rcv_msg_doit.isra.0+0x2e0/0x2e0
[ 96.102260] ? devlink_get_from_attrs_lock+0x460/0x460
[ 96.102925] ? devlink_reload+0x540/0x540
[ 96.103429] ? devlink_pernet_pre_exit+0x340/0x340
[ 96.104017] netlink_rcv_skb+0x170/0x440
[ 96.104508] ? genl_family_rcv_msg_doit.isra.0+0x2e0/0x2e0
[ 96.105166] ? netlink_ack+0x1380/0x1380
[ 96.105662] ? lock_contended+0xc70/0xc70
[ 96.106172] ? rwsem_down_read_slowpath+0xda0/0xda0
[ 96.106767] ? netlink_deliver_tap+0x1b6/0xd90
[ 96.107317] ? is_vmalloc_addr+0x8b/0xb0
[ 96.107804] genl_rcv+0x2d/0x40
[ 96.108213] netlink_unicast+0x53f/0x810
[ 96.108698] ? netlink_attachskb+0x870/0x870
[ 96.109230] ? lock_release+0x3ac/0xbb0
[ 96.109714] netlink_sendmsg+0x95c/0xe80
[ 96.110206] ? netlink_unicast+0x810/0x810
[ 96.110711] ? __might_fault+0x15b/0x190
[ 96.111194] ? _copy_from_user+0x9f/0xd0
[ 96.111691] __sys_sendto+0x2aa/0x420
[ 96.112148] ? __ia32_sys_getpeername+0xb0/0xb0
[ 96.112706] ? reacquire_held_locks+0x4e0/0x4e0
[ 96.113275] ? rcu_is_watching+0x12/0xb0
[ 96.113769] ? blkcg_exit_disk+0x50/0x50
[ 96.114260] __x64_sys_sendto+0xe5/0x1c0
[ 96.114742] ? lockdep_hardirqs_on+0x7d/0x100
[ 96.115285] ? syscall_enter_from_user_mode+0x20/0x50
[ 96.115899] do_syscall_64+0x38/0x80
[ 96.116352] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 96.116964] RIP: 0033:0x7fa073f72fa7
[ 96.117417] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 80 3d 5d d6 0c 00 00 41 89 ca 74 10 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 71 c3 55 48 83 ec 30 44 89 4c 24 2c 4c 89 44
[ 96.119548] RSP: 002b:00007ffca2723938 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[ 96.120449] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fa073f72fa7
[ 96.121287] RDX: 0000000000000034 RSI: 000055883950a460 RDI: 0000000000000003
[ 96.122120] RBP: 0000558837836e60 R08: 00007fa074050200 R09: 000000000000000c
[ 96.122951] R10: 0000000000000000 R11: 0000000000000202 R12: 000055883950a2a0
[ 96.123789] R13: 000055883950a460 R14: 000055883784eeab R15: 000055883950a2a0
[ 96.124638] </TASK>
[ 96.125120] Allocated by task 209:
[ 96.125543] kasan_save_stack+0x33/0x50
[ 96.125567] kasan_set_track+0x25/0x30
[ 96.125587] __kasan_kmalloc+0x87/0x90
[ 96.125606] new_device_store+0x22d/0x6a0
[ 96.125625] bus_attr_store+0x7b/0xa0
[ 96.125641] sysfs_kf_write+0x11c/0x170
[ 96.125659] kernfs_fop_write_iter+0x3f7/0x600
[ 96.125676] vfs_write+0x680/0xe90
[ 96.125691] ksys_write+0x143/0x270
[ 96.125707] do_syscall_64+0x38/0x80
[ 96.125722] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 96.125947] Freed by task 209:
[ 96.126332] kasan_save_stack+0x33/0x50
[ 96.126355] kasan_set_track+0x25/0x30
[ 96.126374] kasan_save_free_info+0x2e/0x40
[ 96.126389] __kasan_slab_free+0x10a/0x180
[ 96.126409] __kmem_cache_free+0x8a/0x1a0
[ 96.126429] device_release+0xa6/0x240
[ 96.126449] kobject_put+0x1f7/0x5b0
[ 96.126465] device_unregister+0x34/0xc0
[ 96.126478] del_device_store+0x308/0x430
[ 96.126496] bus_attr_store+0x7b/0xa0
[ 96.126511] sysfs_kf_write+0x11c/0x170
[ 96.126528] kernfs_fop_write_iter+0x3f7/0x600
[ 96.126545] vfs_write+0x680/0xe90
[ 96.126560] ksys_write+0x143/0x270
[ 96.126575] do_syscall_64+0x38/0x80
[ 96.126590] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 96.126815] The buggy address belongs to the object at ffff888109caa800
which belongs to the cache kmalloc-1k of size 1024
[ 96.128252] The buggy address is located 216 bytes inside of
freed 1024-byte region [ffff888109caa800, ffff888109caac00)
[ 96.129875] The buggy address belongs to the physical page:
[ 96.130540] page:00000000c7912b23 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109ca8
[ 96.130562] head:00000000c7912b23 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 96.130575] flags: 0x200000000010200(slab|head|node=0|zone=2)
[ 96.130591] page_type: 0xffffffff()
[ 96.130607] raw: 0200000000010200 ffff888100041dc0 dead000000000100 dead000000000122
[ 96.130622] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
[ 96.130632] page dumped because: kasan: bad access detected
[ 96.130844] Memory state around the buggy address:
[ 96.131424] ffff888109caa780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 96.132266] ffff888109caa800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 96.133101] >ffff888109caa880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 96.133944] ^
[ 96.134666] ffff888109caa900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 96.135510] ffff888109caa980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 96.136351] ==================================================================
Powered by blists - more mailing lists