lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2a901b8a-9052-41d9-a70d-76508ebd819b@linuxfoundation.org>
Date: Tue, 8 Jul 2025 12:16:48 -0600
From: Shuah Khan <skhan@...uxfoundation.org>
To: Zongmin Zhou <min_halo@....com>, Greg KH <gregkh@...uxfoundation.org>
Cc: shuah@...nel.org, valentina.manea.m@...il.com, i@...ithal.me,
 linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
 zhouzongmin@...inos.cn, Shuah Khan <skhan@...uxfoundation.org>
Subject: Re: [PATCH v2] usbip: convert to use faux_device

On 7/3/25 00:04, Zongmin Zhou wrote:
> 
> On 2025/7/3 07:54, Shuah Khan wrote:
>> On 7/1/25 20:12, Zongmin Zhou wrote:
>>>
>>> On 2025/7/2 06:56, Shuah Khan wrote:
>>>> On 6/23/25 21:21, Zongmin Zhou wrote:
>>>>>
>>>>> On 2025/6/21 01:26, Shuah Khan wrote:
>>>>>> On 6/20/25 03:27, Greg KH wrote:
>>>>>>> On Fri, Jun 20, 2025 at 05:19:33PM +0800, Zongmin Zhou wrote:
>>>>>>>>
>>>>>>>> On 2025/6/20 12:29, Greg KH wrote:
>>>>>>>>> On Fri, Jun 20, 2025 at 10:16:16AM +0800, Zongmin Zhou wrote:
>>>>>>>>>> On 2025/6/19 19:01, Greg KH wrote:
>>>>>>>>>>> On Wed, Jun 04, 2025 at 02:54:10PM +0800, Zongmin Zhou wrote:
>>>>>>>>>>>> From: Zongmin Zhou <zhouzongmin@...inos.cn>
>>>>>>>>>>>>
>>>>>>>>>>>> The vhci driver does not need to create a platform device,
>>>>>>>>>>>> it only did so because it was simple to do that in order to
>>>>>>>>>>>> get a place in sysfs to hang some device-specific attributes.
>>>>>>>>>>>> Now the faux device interface is more appropriate,change it
>>>>>>>>>>>> over to use the faux bus instead.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Zongmin Zhou <zhouzongmin@...inos.cn>
>>>>>>>>>>>> ---
>>>>>>>>>>>> Changes in v2:
>>>>>>>>>>>> - don't change faux create api,just call probe on vhci_hcd_init.
>>>>>>>>>>>>
>>>>>>>>>>>>     drivers/usb/usbip/vhci.h             |  4 +-
>>>>>>>>>>>>     drivers/usb/usbip/vhci_hcd.c         | 86 +++++++++++-----------------
>>>>>>>>>>>>     drivers/usb/usbip/vhci_sysfs.c       | 68 +++++++++++-----------
>>>>>>>>>>>>     tools/usb/usbip/libsrc/vhci_driver.h |  2 +-
>>>>>>>>>>>>     4 files changed, 72 insertions(+), 88 deletions(-)
>>>>>>>>>>> I get the following build errors from this patch:
>>>>>>>>>>>
>>>>>>>>>>> drivers/usb/usbip/vhci_hcd.c:1462:12: error: ‘vhci_hcd_resume’ defined but not used [-Werror=unused-function]
>>>>>>>>>>>     1462 | static int vhci_hcd_resume(struct faux_device *fdev)
>>>>>>>>>>>          |            ^~~~~~~~~~~~~~~
>>>>>>>>>>> drivers/usb/usbip/vhci_hcd.c:1418:12: error: ‘vhci_hcd_suspend’ defined but not used [-Werror=unused-function]
>>>>>>>>>>>     1418 | static int vhci_hcd_suspend(struct faux_device *fdev, pm_message_t state)
>>>>>>>>>>>          |            ^~~~~~~~~~~~~~~~
>>>>>>>>>>> cc1: all warnings being treated as errors
>>>>>>>>>>>
>>>>>>>>>>> Are you sure you tested this?
>>>>>>>>>> I apologize for not enabling -Werror, which resulted in missing this error
>>>>>>>>>> warning.
>>>>>>>>>> I have tested usbip feature use the new patch,but not test system
>>>>>>>>>> suspend/resume.
>>>>>>>>>> The faux bus type don't add pm function,and vhci-hcd driver can't register
>>>>>>>>>> it.
>>>>>>>>>> Maybe have to add suspend/resume for it.like below:
>>>>>>>>>> static const struct bus_type faux_bus_type = {
>>>>>>>>>>       .name        = "faux",
>>>>>>>>>>       .match        = faux_match,
>>>>>>>>>>       .probe        = faux_probe,
>>>>>>>>>>       .remove        = faux_remove,
>>>>>>>>>>       .resume     = faux_resume,
>>>>>>>>>>       .suspend    = faux_suspend,
>>>>>>>>>> };
>>>>>>>>>>
>>>>>>>>>> Is that right?
>>>>>>>>>> Your expertise would be greatly valued.
>>>>>>>>> As this is not real hardware, why do you need the suspend/resume
>>>>>>>>> callbacks at all?  What is happening here that requires them?
>>>>>>>> @greg,
>>>>>>>> The vhci_hcd_suspend/vhci_hcd_resume interfaces are not designed for this
>>>>>>>> faux device, but rather to
>>>>>>>> manipulate the HCD_FLAG_HW_ACCESSIBLE bit in the hcd flags associated with
>>>>>>>> the faux device.
>>>>>>>> For example:
>>>>>>>> During system standby: clear_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags)
>>>>>>>> During system wakeup: set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags)
>>>>>>>>
>>>>>>>> Previously, these two functions were registered through platform_driver,
>>>>>>>> but faux bus does not have the relevant interface, so they were not called,
>>>>>>>> resulting in this compilation warning error.
>>>>>>>>
>>>>>>>> This raises the question: Should the faux bus implement PM-related
>>>>>>>> interface?
>>>>>>>> I'm uncertain whether these functions are essential for the vhci-hcd driver
>>>>>>>> or if they can be safely removed.
>>>>>>>>
>>>>>>>> However, during system standby/wakeup tests with remote USB devices bound to
>>>>>>>> the vhci-hcd driver,
>>>>>>>> I observed consistent failure scenarios across both the original platform
>>>>>>>> bus and faux bus patch implementations.
>>>>>>
>>>>>> suspend and resume hooks have been in the code from beginning
>>>>>> in the CONFIG_PM path. The original authors are skeptical about
>>>>>> what should happen during suspend"
>>>>>>
>>>>>> /* what should happen for USB/IP under suspend/resume? */
>>>>>> suspend hook checks for active connections and sends EBBUSY
>>>>>> back to pm core.
>>>>>>
>>>>>> Active connection means any of the ports are in USB_PORT_STAT_CONNECTION
>>>>>> state. So as long as there are active connections between vhci client
>>>>>> and the host, suspend won't work. So we really don't know what happens
>>>>>> to the active connections if there are no suspend/resume hooks.
>>>>>>
>>>>>> If there are no active connections, then it will clear the HCD_FLAG_HW_ACCESSIBLE
>>>>>> bit and returns to pm core to continue with suspend.
>>>>>>
>>>>>> resume sets the HCD_FLAG_HW_ACCESSIBLE flag to indicate hardware is accessible
>>>>>> and initiates port status poll.
>>>>>>
>>>>>> - suspend hook prevents suspend
>>>>>>
>>>>>> With faux device since there is no suspend and resume hook, I would expect
>>>>>> these hooks are not a factor in suspend and resume.
>>>>>>
>>>>>> vhci_hcd is a special case virtual driver as it is a proxy for controlling
>>>>>> hardware on the host.
>>>>>>
>>>>>> Zongmin,
>>>>>>
>>>>>> Do suspend/resume work when vhci_hcd is not loaded?
>>>>>> What happens when when system suspends and resumes with faux device?
>>>>>> Can you send me dmesg logs and pm logs?
>>>>>>
>>>>> Hi Greg,shuah
>>>>>
>>>>> Yes, system with vhci_hcd unload or just load vhci_hcd driver
>>>>> without usbip devices bound to vhci_hcd,system suspend/resume worked well.
>>>>> Some logs for you.
>>>>
>>>> Sorry for the delay. I was at a conference last week.
>>>> Thank you for sending these logs.
>>>>
>>>>> 1.system with vhci_hcd driver load,but don't bound any usbip devices to vhci_hcd,suspend/resume worked well.
>>>>> see dmesg-faux bus-load.log
>>>>
>>>>
>>>>>
>>>>> 2.usbip device bound to vhci_hcd,and do system suspend action,suspend/resume worked failed.
>>>>> I observed two distinct failure scenario:
>>>>> Scenario 1: System failed to enter suspend state,and will auto resume;
>>>>
>>>>
>>>>> the log for use platform bus:
>>>>> dmesg-platform bus-device bound-auto resume.log
>>>>
>>>> This is clear from the suspend hook in the vhci_hcd.
>>>> When it returns EBUSY, suspend will fail and move to
>>>> auto resume.
>>>>
>>>>> the log for use faux bus:
>>>>> dmesg-faux bus-device bound-auto resume.log
>>>>
>>>> It is good that the behavior is same with faux bus even though
>>>> suspend hook isn't called. I will take a look at the log.
>>>>
>>>>>
>>>>> Scenario 2: System resume failed with black screen freeze,a forced restart of the machine is require.
>>>>> the log for use platform bus:
>>>>> dmesg-platform bus-device bound-black screen.log
>>>>> the log for use faux bus:
>>>>> dmesg-faux bus-device bound-black screen.log
>>>>
>>>> That is bad. When does this happen? Is there a difference in
>>>> configuration?
>>>>
>>> No, there's no difference. The same code is used for two different failure scenarios.
>>> I just tested many times. These two different situations occur probabilistically.
>>> But they both happened only when the USBIP device is bound to the vhci_hcd driver.
>>>> The behavior same for platform bus and faux bus. Sounds like
>>>> an existing problem that needs to be debugged separately.
>>>>
>>>> I will take a look at all these logs and get back to you in
>>>> a day or two.
>>>>
>>
>> I looked at the logs and here is what I found:
>>
>> Scenario 1:
>>   dmesg-faux bus-device bound-auto resume.log
>>   dmesg-platform bus-device bound-auto resume.log
>>
>> In this case suspend bailed out way before driver suspend
>> when vhci_hcd is using platform and faux bus.
>>
>> Freezing remaining freezable tasks failed after 20.006 seconds (0 tasks refusing to freeze, wq_busy=1)
>> Restarting kernel threads ... done
>> OOM killer enabled.
>> Restarting tasks ... done.
>> random: crng reseeded on system resumption
>> PM: suspend exit
>>
>> Auto-resume of the user-space worked. Scenario 1 isn't really
>> interesting.
>>
>> Scenario 2:
>>   dmesg-faux bus-device bound-black screen.log
>>   dmesg-platform bus-device bound-black screen.log
>>
>> Even though the result is the same in seeing blank screen, how
>> we get there is different.
>>
>> =================
>> faux-bus device:
>> =================
>> - suspend worked - looks like
>>   usb 4-1: PM: calling usb_dev_suspend @ 6054, parent: usb4
>>   usb 4-1: PM: usb_dev_suspend returned 0 after 13675 usecs
>>   usb usb4: PM: calling usb_dev_suspend @ 6055, parent: vhci_hcd.0
>>   usb usb4: PM: usb_dev_suspend returned 0 after 9 usecs
>>
>> vhci_hcd suspend isn't in play here. usb_dev_suspend() returns.
>> See below
>>
>> usb 4-1: PM: usb_dev_suspend returned message.
>>
>> -------------------------------------------------------------
>>
>> - resume started (assuming it has been initiated by user)
>>
>> [  650.027491][ T6056] pci 0000:00:01.0: PM: pci_pm_suspend_noirq returned 0 after 304 usecs
>>
>> See see timestamp difference between the last suspend message and the
>> first resume message.
>> [  674.000257][   T39] pci 0000:00:00.0: PM: calling pci_pm_resume_noirq @ 39, parent: pci0000:00
>>
>> usb 4-1 usb_dev_resume never returns.
>>
>> [  674.071125][ T6117] usb usb4: PM: usb_dev_resume returned 0 after 21110 usecs
>> [  674.113991][ T6126] usb 4-1: PM: calling usb_dev_resume @ 6126, parent: usb4
>>
>> -------------------------------------------------------------
>>
>> =====================
>> platform bus device
>> =====================
>>
>> - suspend was aborted because vhci_hcd suspend routine returned error
>>
>> [  297.854621][ T9402] usb 4-1: PM: calling usb_dev_suspend @ 9402, parent: usb4
>> [  297.868524][ T9402] usb 4-1: PM: usb_dev_suspend returned 0 after 13214 usecs
>> [  297.869994][ T9403] usb usb4: PM: calling usb_dev_suspend @ 9403, parent: vhci_hcd.0
>> [  297.871959][ T9403] usb usb4: PM: usb_dev_suspend returned 0 after 30 usecs
>> [  297.873644][ T9394] vhci_hcd vhci_hcd.0: PM: calling platform_pm_suspend @ 9394, parent: platform
>> [  297.874738][ T9394] vhci_hcd vhci_hcd.0: We have 1 active connection. Do not suspend.
>> [  297.875369][ T9394] vhci_hcd vhci_hcd.0: PM: dpm_run_callback(): platform_pm_suspend returns -16
>> [  297.876078][ T9394] vhci_hcd vhci_hcd.0: PM: platform_pm_suspend returned -16 after 1341 usecs
>> [  297.876774][ T9394] vhci_hcd vhci_hcd.0: PM: failed to suspend: error -16
>>
>> - the following triggers resume
>> [  297.877321][ T9394] PM: Some devices failed to suspend, or early wake event detected
>>
>> [  297.881065][ T9403] usb usb3: PM: usb_dev_resume returned 0 after 19 usecs
>> [  297.904551][ T9408] usb usb4: PM: usb_dev_resume returned 0 after 22911 usecs
>> [  297.905148][ T9418] usb 4-1: PM: calling usb_dev_resume @ 9418, parent: usb4
>>
>> usb 4-1 usb_dev_resume never returns.
>>
>> Note - In both cases, usb_dev_resume doesn't return. When it is called is the
>> difference.
>>
>> I don't think suspend/resume works when devices are bound. Suspend exits and
>> starts resume which seems to fail because it doesn't handle the virtual usb
>> device resume. There is a missing piece here.
>>
>> When vhci_hcd imports a device and acts as a proxy between the virtual mass
>> storage device (e.g in this case) - it appears suspend and resume are
>> handled as if it is a usb device. Maybe this is incorrect?
>>
>> usb_dev_suspend() works and usb_dev_resume() on these virtual usb devices?
>> Do we need to handle this in usb_dev_resume()?
>>
>> Talking out loud - I have to do some looking into.
>>
> Re:
> Yes, your analysis is completely correct.
> 
> In fact, I've experimented with adding PM hooks to the faux bus,
> and found that faux bus devices then behave identically to platform bus devices during suspend/resume.
> See the attachment.
> 

Thanks for checking this scenario. No surprises here.

> This is likely a historical legacy issue.

It is an existing problem in the way vhci_hcd and the bound devices
handle suspend/resume. vhci_hcd suspend assumes once it returns
"don't suspend" the rest works. However suspend for the bound device
runs first and a subsequent resume on it fails.

This problem needs fixing - I don't know yet how to.

> Regarding this matter, is there anything else you'd like me to assist with?
> 

One thing I am curious about is the status of the bound devices after
"forced restart of the machine" when you see blank screen or hang?

thanks,
-- Shuah

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ