Message-ID: <CACGkMEufuXdLKXx9GuEOnBnREz622f=FVt-0r3UBNUKWz_Q78g@mail.gmail.com>
Date: Mon, 14 Jul 2025 17:00:34 +0800
From: Jason Wang <jasowang@...hat.com>
To: Dragos Tatulea <dtatulea@...dia.com>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
Eugenio Pérez <eperezma@...hat.com>,
Wenli Quan <wquan@...hat.com>, Tariq Toukan <tariqt@...dia.com>, Cosmin Ratiu <cratiu@...dia.com>,
virtualization@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH vhost] vdpa/mlx5: Fix release of uninitialized resources on error path

On Tue, Jul 8, 2025 at 8:05 PM Dragos Tatulea <dtatulea@...dia.com> wrote:
>
> The commit in the fixes tag made sure that mlx5_vdpa_free()
> is the single entrypoint for removing the vdpa device resources
> added in mlx5_vdpa_dev_add(), even in the cleanup path of
> mlx5_vdpa_dev_add().
>
> This means that all functions from mlx5_vdpa_free() should be able to
> handle uninitialized resources. This was not the case though:
> mlx5_vdpa_destroy_mr_resources() and mlx5_cmd_cleanup_async_ctx()
> were not able to do so. This caused the splat below when adding
> a vdpa device without a MAC address.
>
> This patch fixes these remaining issues:
>
> - Makes mlx5_vdpa_destroy_mr_resources() return early if called on
> uninitialized resources.
>
> - Moves mlx5_cmd_init_async_ctx() early on during device addition
> because it can't fail. This means that mlx5_cmd_cleanup_async_ctx()
> also can't fail. To mirror this, move the call site of
> mlx5_cmd_cleanup_async_ctx() in mlx5_vdpa_free().
>
> An additional comment was added in mlx5_vdpa_free() to document
> the expectations of functions called from this context.
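
For illustration only, the pattern described above might be sketched in
userspace C roughly as follows. This is not the driver code: every name
(demo_dev, demo_dev_add, the initialized flag and so on) is hypothetical,
and the actual patch may detect uninitialized resources differently. The
sketch assumes a zero-allocated device, an init step that cannot fail and
therefore runs first, and a single free function that tolerates any
partially initialized state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical state; none of these names come from mlx5_vdpa. */
struct demo_async_ctx {
	int pending;		/* outstanding async commands */
};

struct demo_mr_resources {
	bool initialized;	/* set only after init fully succeeded */
	int *table;
};

struct demo_dev {
	struct demo_async_ctx async_ctx;
	struct demo_mr_resources mres;
};

/* Cannot fail, so it may run first thing during device add. */
void demo_init_async_ctx(struct demo_async_ctx *ctx)
{
	ctx->pending = 0;
}

/* Always safe to call: the matching init ran unconditionally. */
void demo_cleanup_async_ctx(struct demo_async_ctx *ctx)
{
	ctx->pending = 0;
}

int demo_init_mr_resources(struct demo_mr_resources *mres)
{
	mres->table = calloc(16, sizeof(*mres->table));
	if (!mres->table)
		return -1;
	mres->initialized = true;
	return 0;
}

/* Returns early when called on resources that were never set up. */
void demo_destroy_mr_resources(struct demo_mr_resources *mres)
{
	if (!mres->initialized)
		return;
	free(mres->table);
	mres->table = NULL;
	mres->initialized = false;
}

/*
 * Single teardown entry point. Everything called from here must cope
 * with uninitialized resources. Teardown mirrors setup order, so the
 * async context is cleaned up last.
 */
void demo_dev_free(struct demo_dev *dev)
{
	demo_destroy_mr_resources(&dev->mres);
	demo_cleanup_async_ctx(&dev->async_ctx);
	free(dev);
}

/* fail_early simulates an error hit before MR resources are set up. */
int demo_dev_add(bool fail_early, struct demo_dev **out)
{
	struct demo_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return -1;

	demo_init_async_ctx(&dev->async_ctx);

	if (fail_early)
		goto err;
	if (demo_init_mr_resources(&dev->mres))
		goto err;

	*out = dev;
	return 0;

err:
	demo_dev_free(dev);	/* same path as a normal release */
	return -1;
}
```

Calling demo_dev_add(true, &dev) takes the error path before the MR
resources exist, and demo_dev_free() still completes because the destroy
helper returns early. The flag is just one way to express "not yet
initialized" in a sketch; a NULL check on a pointer member would serve
the same purpose.
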
>
> Splat:
>
> mlx5_core 0000:b5:03.2: mlx5_vdpa_dev_add:3950:(pid 2306) warning: No mac address provisioned?
> ------------[ cut here ]------------
> WARNING: CPU: 13 PID: 2306 at kernel/workqueue.c:4207 __flush_work+0x9a/0xb0
> [...]
> Call Trace:
> <TASK>
> ? __try_to_del_timer_sync+0x61/0x90
> ? __timer_delete_sync+0x2b/0x40
> mlx5_vdpa_destroy_mr_resources+0x1c/0x40 [mlx5_vdpa]
> mlx5_vdpa_free+0x45/0x160 [mlx5_vdpa]
> vdpa_release_dev+0x1e/0x50 [vdpa]
> device_release+0x31/0x90
> kobject_cleanup+0x37/0x130
> mlx5_vdpa_dev_add+0x327/0x890 [mlx5_vdpa]
> vdpa_nl_cmd_dev_add_set_doit+0x2c1/0x4d0 [vdpa]
> genl_family_rcv_msg_doit+0xd8/0x130
> genl_family_rcv_msg+0x14b/0x220
> ? __pfx_vdpa_nl_cmd_dev_add_set_doit+0x10/0x10 [vdpa]
> genl_rcv_msg+0x47/0xa0
> ? __pfx_genl_rcv_msg+0x10/0x10
> netlink_rcv_skb+0x53/0x100
> genl_rcv+0x24/0x40
> netlink_unicast+0x27b/0x3b0
> netlink_sendmsg+0x1f7/0x430
> __sys_sendto+0x1fa/0x210
> ? ___pte_offset_map+0x17/0x160
> ? next_uptodate_folio+0x85/0x2b0
> ? percpu_counter_add_batch+0x51/0x90
> ? filemap_map_pages+0x515/0x660
> __x64_sys_sendto+0x20/0x30
> do_syscall_64+0x7b/0x2c0
> ? do_read_fault+0x108/0x220
> ? do_pte_missing+0x14a/0x3e0
> ? __handle_mm_fault+0x321/0x730
> ? count_memcg_events+0x13f/0x180
> ? handle_mm_fault+0x1fb/0x2d0
> ? do_user_addr_fault+0x20c/0x700
> ? syscall_exit_work+0x104/0x140
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> RIP: 0033:0x7f0c25b0feca
> [...]
> ---[ end trace 0000000000000000 ]---
>
> Signed-off-by: Dragos Tatulea <dtatulea@...dia.com>
> Fixes: 83e445e64f48 ("vdpa/mlx5: Fix error path during device add")
> Reported-by: Wenli Quan <wquan@...hat.com>
> Closes: https://lore.kernel.org/virtualization/CADZSLS0r78HhZAStBaN1evCSoPqRJU95Lt8AqZNJ6+wwYQ6vPQ@mail.gmail.com/
> Reviewed-by: Tariq Toukan <tariqt@...dia.com>
> Reviewed-by: Cosmin Ratiu <cratiu@...dia.com>
Acked-by: Jason Wang <jasowang@...hat.com>
Thanks