lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Sat, 2 Dec 2023 10:21:37 +0800
From: Ding Hui <dinghui@...gfor.com.cn>
To: Shifeng Li <lishifeng@...gfor.com.cn>, saeedm@...dia.com,
 leon@...nel.org, davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
 pabeni@...hat.com, eranbe@...lanox.com, moshe@...lanox.com
Cc: netdev@...r.kernel.org, linux-rdma@...r.kernel.org,
 linux-kernel@...r.kernel.org, lishifeng1992@....com,
 Moshe Shemesh <moshe@...dia.com>
Subject: Re: [PATCH v3] net/mlx5e: Fix a race in command alloc flow

On 2023/11/30 11:05, Shifeng Li wrote:
> Fix a cmd->ent use after free due to a race on command entry.
> Such race occurs when one of the commands releases its last refcount and
> frees its index and entry while another process running command flush
> flow takes refcount to this command entry. The process which handles
> commands flush may see this command as needed to be flushed if the other
> process allocated a ent->idx but didn't set ent to cmd->ent_arr in
> cmd_work_handler(). Fix it by moving the assignment of cmd->ent_arr into
> the spin lock.
> 
> [70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
> [70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361
> [70013.081968]
> [70013.082028] Workqueue: events aer_isr
> [70013.082053] Call Trace:
> [70013.082067]  dump_stack+0x8b/0xbb
> [70013.082086]  print_address_description+0x6a/0x270
> [70013.082102]  kasan_report+0x179/0x2c0
> [70013.082173]  mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
> [70013.082267]  mlx5_cmd_flush+0x80/0x180 [mlx5_core]
> [70013.082304]  mlx5_enter_error_state+0x106/0x1d0 [mlx5_core]
> [70013.082338]  mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core]
> [70013.082377]  remove_one+0x200/0x2b0 [mlx5_core]
> [70013.082409]  pci_device_remove+0xf3/0x280
> [70013.082439]  device_release_driver_internal+0x1c3/0x470
> [70013.082453]  pci_stop_bus_device+0x109/0x160
> [70013.082468]  pci_stop_and_remove_bus_device+0xe/0x20
> [70013.082485]  pcie_do_fatal_recovery+0x167/0x550
> [70013.082493]  aer_isr+0x7d2/0x960
> [70013.082543]  process_one_work+0x65f/0x12d0
> [70013.082556]  worker_thread+0x87/0xb50
> [70013.082571]  kthread+0x2e9/0x3a0
> [70013.082592]  ret_from_fork+0x1f/0x40
> 

It is better if you also put the diagram [1] in the commit log, that is easy to understand.

[1] https://www.spinics.net/lists/netdev/msg951955.html

> Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
> Reviewed-by: Moshe Shemesh <moshe@...dia.com>
> Signed-off-by: Shifeng Li <lishifeng@...gfor.com.cn>

-- 
Thanks,
- Ding Hui


Powered by blists - more mailing lists