Message-ID: <5051e27a.2ba3.198fa7b5f31.Coremail.00107082@163.com>
Date: Sat, 30 Aug 2025 18:17:26 +0800 (CST)
From: "David Wang" <00107082@....com>
To: Michał Pecio <michal.pecio@...il.com>
Cc: WeitaoWang-oc@...oxin.com, mathias.nyman@...ux.intel.com,
	gregkh@...uxfoundation.org, linux-usb@...r.kernel.org,
	regressions@...ts.linux.dev, linux-kernel@...r.kernel.org,
	surenb@...gle.com, kent.overstreet@...ux.dev
Subject: Re: [REGRESSION 6.17-rc3] usb/xhci: possible memory leak after
 suspend/resume cycle.


At 2025-08-30 17:48:28, "Michał Pecio" <michal.pecio@...il.com> wrote:
>On Sat, 30 Aug 2025 02:13:54 +0800, David Wang wrote:
>> Hi,
>>
>> I have been watching kernel memory usage for drivers for a while, via /proc/allocinfo.
>> After upgrading to 6.17-rc3, I noticed that memory usage behavior changed for usb drivers:
>> 
>> Before rc3, after several suspend/resume cycles, usb devices' memory usage was very stable:
>> 
>>        40960        5 drivers/usb/host/xhci-mem.c:980 [xhci_hcd] func:xhci_alloc_virt_device 9
>>         1024        1 drivers/usb/host/xhci-mem.c:841 [xhci_hcd] func:xhci_alloc_tt_info 2
>>          320       10 drivers/usb/host/xhci-mem.c:461 [xhci_hcd] func:xhci_alloc_container_ctx 31
>>         1920       15 drivers/usb/host/xhci-mem.c:377 [xhci_hcd] func:xhci_ring_alloc 31
>>          112       12 drivers/usb/host/xhci-mem.c:49 [xhci_hcd] func:xhci_segment_alloc 32
>>         1792       28 drivers/usb/host/xhci-mem.c:38 [xhci_hcd] func:xhci_segment_alloc 59
>> 
>> But with rc3, the memory usage increases after each suspend/resume cycle: 
>> 
>> #1:
>>        49152        6 drivers/usb/host/xhci-mem.c:980 [xhci_hcd] func:xhci_alloc_virt_device 9
>>         1024        1 drivers/usb/host/xhci-mem.c:841 [xhci_hcd] func:xhci_alloc_tt_info 2
>>          384       12 drivers/usb/host/xhci-mem.c:461 [xhci_hcd] func:xhci_alloc_container_ctx 32
>>         2176       17 drivers/usb/host/xhci-mem.c:377 [xhci_hcd] func:xhci_ring_alloc 32
>>          128       14 drivers/usb/host/xhci-mem.c:49 [xhci_hcd] func:xhci_segment_alloc 34
>>         2048       32 drivers/usb/host/xhci-mem.c:38 [xhci_hcd] func:xhci_segment_alloc 61
>> #2:
>>        57344        7 drivers/usb/host/xhci-mem.c:980 [xhci_hcd] func:xhci_alloc_virt_device 13
>>         1024        1 drivers/usb/host/xhci-mem.c:841 [xhci_hcd] func:xhci_alloc_tt_info 3
>>          448       14 drivers/usb/host/xhci-mem.c:461 [xhci_hcd] func:xhci_alloc_container_ctx 46
>>         2432       19 drivers/usb/host/xhci-mem.c:377 [xhci_hcd] func:xhci_ring_alloc 43
>>          144       16 drivers/usb/host/xhci-mem.c:49 [xhci_hcd] func:xhci_segment_alloc 44
>>         2304       36 drivers/usb/host/xhci-mem.c:38 [xhci_hcd] func:xhci_segment_alloc 82
>> #3:
>>        65536        8 drivers/usb/host/xhci-mem.c:980 [xhci_hcd] func:xhci_alloc_virt_device 17
>>         1024        1 drivers/usb/host/xhci-mem.c:841 [xhci_hcd] func:xhci_alloc_tt_info 4
>>          512       16 drivers/usb/host/xhci-mem.c:461 [xhci_hcd] func:xhci_alloc_container_ctx 60
>>         2688       21 drivers/usb/host/xhci-mem.c:377 [xhci_hcd] func:xhci_ring_alloc 54
>>          160       18 drivers/usb/host/xhci-mem.c:49 [xhci_hcd] func:xhci_segment_alloc 54
>>         2560       40 drivers/usb/host/xhci-mem.c:38 [xhci_hcd] func:xhci_segment_alloc 103
>> 
>> The pattern of increasing memory continues with each subsequent suspend/resume; I am not
>> sure whether that memory would be released sometime later.
>> 
>> And in the kernel log, two error lines always show up after suspend/resume:
>> 
>> 	[  295.613598] xhci_hcd 0000:03:00.0: dma_pool_destroy xHCI ring segments busy
>> 	[  295.613605] xhci_hcd 0000:03:00.0: dma_pool_destroy xHCI input/output contexts busy
>
>Hi,
>
>Good work, looks like suspend/resume is a slightly undertested corner
>of this driver.
>
>Did you check whether the same leak occurs if you simply disconnect
>a device or if it's truly unique to suspend?
>
>> And bisecting narrowed it down to commit 2eb03376151bb8585caa23ed2673583107bb5193
>> ("usb: xhci: Fix slot_id resource race conflict"):
>
>I see a trivial bug which everyone (myself included tbh) missed before.
>Does this help?
>
>diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
>index f11e13f9cdb4..f294032c2ad7 100644
>--- a/drivers/usb/host/xhci-mem.c
>+++ b/drivers/usb/host/xhci-mem.c
>@@ -932,7 +932,7 @@ void xhci_free_virt_device(struct xhci_hcd *xhci, struct xhci_virt_device *dev,
>  */
> static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_id)
> {
>-	struct xhci_virt_device *vdev;
>+	struct xhci_virt_device *vdev, *tmp_vdev;
> 	struct list_head *tt_list_head;
> 	struct xhci_tt_bw_info *tt_info, *next;
> 	int i;
>@@ -952,8 +952,8 @@ static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_i
> 		if (tt_info->slot_id == slot_id) {
> 			/* are any devices using this tt_info? */
> 			for (i = 1; i < HCS_MAX_SLOTS(xhci->hcs_params1); i++) {
>-				vdev = xhci->devs[i];
>-				if (vdev && (vdev->tt_info == tt_info))
>+				tmp_vdev = xhci->devs[i];
>+				if (tmp_vdev && (tmp_vdev->tt_info == tt_info))
> 					xhci_free_virt_devices_depth_first(
> 						xhci, i);

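I confirmed that this *silly* code is the root cause of the memory leak:
the loop reuses vdev as its scan variable, so by the time execution reaches
"out:", vdev no longer points to the device for slot_id.
I would suggest a simpler change (which is what I was testing):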


diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 81eaad87a3d9..c4a6544aa107 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -962,7 +962,7 @@ static void xhci_free_virt_devices_depth_first(struct xhci_hcd *xhci, int slot_i
 out:
        /* we are now at a leaf device */
        xhci_debugfs_remove_slot(xhci, slot_id);
-       xhci_free_virt_device(xhci, vdev, slot_id);
+       xhci_free_virt_device(xhci, xhci->devs[slot_id], slot_id);
 }
 
 int xhci_alloc_virt_device(struct xhci_hcd *xhci, int slot_id,
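
For anyone who wants to see the effect in isolation, here is a toy
user-space sketch of the pattern. It is not the driver code: toy_vdev,
devs[], MAX_SLOTS, toy_free_vdev(), toy_free_depth_first() and toy_run()
are all made-up names, parent_slot stands in for the tt_info match, and
the assumption that a NULL pointer makes the free a no-op is part of the
model only.

```c
#include <stdlib.h>

#define MAX_SLOTS 4	/* hypothetical; slot 0 is unused, slots are 1-based */

struct toy_vdev {
	int parent_slot;	/* stand-in for "sits behind this hub's TT" */
};

static struct toy_vdev *devs[MAX_SLOTS];

/* Model assumption: a NULL dev means nothing is freed for slot_id. */
static void toy_free_vdev(struct toy_vdev *dev, int slot_id)
{
	if (!dev)
		return;
	free(dev);
	devs[slot_id] = NULL;
}

static void toy_free_depth_first(int slot_id, int use_fix)
{
	struct toy_vdev *vdev = devs[slot_id];
	int i;

	if (!vdev)
		return;

	/* Free children first; note that the scan reuses vdev. */
	for (i = 1; i < MAX_SLOTS; i++) {
		vdev = devs[i];
		if (vdev && vdev->parent_slot == slot_id)
			toy_free_depth_first(i, use_fix);
	}

	/*
	 * After the loop vdev holds devs[MAX_SLOTS - 1], not devs[slot_id].
	 * The fixed variant looks the device up by slot_id again.
	 */
	toy_free_vdev(use_fix ? devs[slot_id] : vdev, slot_id);
}

/* Returns how many devices are still allocated after the teardown. */
static int toy_run(int use_fix)
{
	int i, remaining = 0;

	for (i = 0; i < MAX_SLOTS; i++)
		devs[i] = NULL;

	devs[1] = calloc(1, sizeof(*devs[1]));	/* hub, parent_slot 0 */
	devs[2] = calloc(1, sizeof(*devs[2]));	/* device behind the hub */
	if (!devs[1] || !devs[2]) {
		free(devs[1]);
		free(devs[2]);
		return -1;
	}
	devs[2]->parent_slot = 1;		/* slot 3 stays empty */

	toy_free_depth_first(1, use_fix);

	for (i = 1; i < MAX_SLOTS; i++) {
		if (devs[i]) {
			remaining++;
			free(devs[i]);		/* clean up what was leaked */
			devs[i] = NULL;
		}
	}
	return remaining;
}
```

In this model toy_run(0) (clobbered pointer) leaves both devices
allocated, while toy_run(1) (lookup by slot_id) leaves none.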



Thanks
David
