lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1a13cf53-ffed-4521-917e-9c2856a5e348@linuxfoundation.org>
Date: Wed, 2 Jul 2025 17:54:50 -0600
From: Shuah Khan <skhan@...uxfoundation.org>
To: Zongmin Zhou <min_halo@....com>, Greg KH <gregkh@...uxfoundation.org>
Cc: shuah@...nel.org, valentina.manea.m@...il.com, i@...ithal.me,
 linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
 zhouzongmin@...inos.cn, Shuah Khan <skhan@...uxfoundation.org>
Subject: Re: [PATCH v2] usbip: convert to use faux_device

On 7/1/25 20:12, Zongmin Zhou wrote:
> 
> On 2025/7/2 06:56, Shuah Khan wrote:
>> On 6/23/25 21:21, Zongmin Zhou wrote:
>>>
>>> On 2025/6/21 01:26, Shuah Khan wrote:
>>>> On 6/20/25 03:27, Greg KH wrote:
>>>>> On Fri, Jun 20, 2025 at 05:19:33PM +0800, Zongmin Zhou wrote:
>>>>>>
>>>>>> On 2025/6/20 12:29, Greg KH wrote:
>>>>>>> On Fri, Jun 20, 2025 at 10:16:16AM +0800, Zongmin Zhou wrote:
>>>>>>>> On 2025/6/19 19:01, Greg KH wrote:
>>>>>>>>> On Wed, Jun 04, 2025 at 02:54:10PM +0800, Zongmin Zhou wrote:
>>>>>>>>>> From: Zongmin Zhou <zhouzongmin@...inos.cn>
>>>>>>>>>>
>>>>>>>>>> The vhci driver does not need to create a platform device,
>>>>>>>>>> it only did so because it was simple to do that in order to
>>>>>>>>>> get a place in sysfs to hang some device-specific attributes.
>>>>>>>>>> Now the faux device interface is more appropriate,change it
>>>>>>>>>> over to use the faux bus instead.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Zongmin Zhou <zhouzongmin@...inos.cn>
>>>>>>>>>> ---
>>>>>>>>>> Changes in v2:
>>>>>>>>>> - don't change faux create api,just call probe on vhci_hcd_init.
>>>>>>>>>>
>>>>>>>>>>     drivers/usb/usbip/vhci.h             |  4 +-
>>>>>>>>>>     drivers/usb/usbip/vhci_hcd.c         | 86 +++++++++++-----------------
>>>>>>>>>>     drivers/usb/usbip/vhci_sysfs.c       | 68 +++++++++++-----------
>>>>>>>>>>     tools/usb/usbip/libsrc/vhci_driver.h |  2 +-
>>>>>>>>>>     4 files changed, 72 insertions(+), 88 deletions(-)
>>>>>>>>> I get the following build errors from this patch:
>>>>>>>>>
>>>>>>>>> drivers/usb/usbip/vhci_hcd.c:1462:12: error: ‘vhci_hcd_resume’ defined but not used [-Werror=unused-function]
>>>>>>>>>     1462 | static int vhci_hcd_resume(struct faux_device *fdev)
>>>>>>>>>          |            ^~~~~~~~~~~~~~~
>>>>>>>>> drivers/usb/usbip/vhci_hcd.c:1418:12: error: ‘vhci_hcd_suspend’ defined but not used [-Werror=unused-function]
>>>>>>>>>     1418 | static int vhci_hcd_suspend(struct faux_device *fdev, pm_message_t state)
>>>>>>>>>          |            ^~~~~~~~~~~~~~~~
>>>>>>>>> cc1: all warnings being treated as errors
>>>>>>>>>
>>>>>>>>> Are you sure you tested this?
>>>>>>>> I apologize for not enabling -Werror, which resulted in missing this error
>>>>>>>> warning.
>>>>>>>> I have tested usbip feature use the new patch,but not test system
>>>>>>>> suspend/resume.
>>>>>>>> The faux bus type don't add pm function,and vhci-hcd driver can't register
>>>>>>>> it.
>>>>>>>> Maybe have to add suspend/resume for it.like below:
>>>>>>>> static const struct bus_type faux_bus_type = {
>>>>>>>>       .name        = "faux",
>>>>>>>>       .match        = faux_match,
>>>>>>>>       .probe        = faux_probe,
>>>>>>>>       .remove        = faux_remove,
>>>>>>>>       .resume     = faux_resume,
>>>>>>>>       .suspend    = faux_suspend,
>>>>>>>> };
>>>>>>>>
>>>>>>>> Is that right?
>>>>>>>> Your expertise would be greatly valued.
>>>>>>> As this is not real hardware, why do you need the suspend/resume
>>>>>>> callbacks at all?  What is happening here that requires them?
>>>>>> @greg,
>>>>>> The vhci_hcd_suspend/vhci_hcd_resume interfaces are not designed for this
>>>>>> faux device, but rather to
>>>>>> manipulate the HCD_FLAG_HW_ACCESSIBLE bit in the hcd flags associated with
>>>>>> the faux device.
>>>>>> For example:
>>>>>> During system standby: clear_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags)
>>>>>> During system wakeup: set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags)
>>>>>>
>>>>>> Previously, these two functions were registered through platform_driver,
>>>>>> but faux bus does not have the relevant interface, so they were not called,
>>>>>> resulting in this compilation warning error.
>>>>>>
>>>>>> This raises the question: Should the faux bus implement PM-related
>>>>>> interface?
>>>>>> I'm uncertain whether these functions are essential for the vhci-hcd driver
>>>>>> or if they can be safely removed.
>>>>>>
>>>>>> However, during system standby/wakeup tests with remote USB devices bound to
>>>>>> the vhci-hcd driver,
>>>>>> I observed consistent failure scenarios across both the original platform
>>>>>> bus and faux bus patch implementations.
>>>>
>>>> suspend and resume hooks have been in the code from beginning
>>>> in the CONFIG_PM path. The original authors are skeptical about
>>>> what should happen during suspend"
>>>>
>>>> /* what should happen for USB/IP under suspend/resume? */
>>>> suspend hook checks for active connections and sends EBBUSY
>>>> back to pm core.
>>>>
>>>> Active connection means any of the ports are in USB_PORT_STAT_CONNECTION
>>>> state. So as long as there are active connections between vhci client
>>>> and the host, suspend won't work. So we really don't know what happens
>>>> to the active connections if there are no suspend/resume hooks.
>>>>
>>>> If there are no active connections, then it will clear the HCD_FLAG_HW_ACCESSIBLE
>>>> bit and returns to pm core to continue with suspend.
>>>>
>>>> resume sets the HCD_FLAG_HW_ACCESSIBLE flag to indicate hardware is accessible
>>>> and initiates port status poll.
>>>>
>>>> - suspend hook prevents suspend
>>>>
>>>> With faux device since there is no suspend and resume hook, I would expect
>>>> these hooks are not a factor in suspend and resume.
>>>>
>>>> vhci_hcd is a special case virtual driver as it is a proxy for controlling
>>>> hardware on the host.
>>>>
>>>> Zongmin,
>>>>
>>>> Do suspend/resume work when vhci_hcd is not loaded?
>>>> What happens when when system suspends and resumes with faux device?
>>>> Can you send me dmesg logs and pm logs?
>>>>
>>> Hi Greg,shuah
>>>
>>> Yes, system with vhci_hcd unload or just load vhci_hcd driver
>>> without usbip devices bound to vhci_hcd,system suspend/resume worked well.
>>> Some logs for you.
>>
>> Sorry for the delay. I was at a conference last week.
>> Thank you for sending these logs.
>>
>>> 1.system with vhci_hcd driver load,but don't bound any usbip devices to vhci_hcd,suspend/resume worked well.
>>> see dmesg-faux bus-load.log
>>
>>
>>>
>>> 2.usbip device bound to vhci_hcd,and do system suspend action,suspend/resume worked failed.
>>> I observed two distinct failure scenario:
>>> Scenario 1: System failed to enter suspend state,and will auto resume;
>>
>>
>>> the log for use platform bus:
>>> dmesg-platform bus-device bound-auto resume.log
>>
>> This is clear from the suspend hook in the vhci_hcd.
>> When it returns EBUSY, suspend will fail and move to
>> auto resume.
>>
>>> the log for use faux bus:
>>> dmesg-faux bus-device bound-auto resume.log
>>
>> It is good that the behavior is same with faux bus even though
>> suspend hook isn't called. I will take a look at the log.
>>
>>>
>>> Scenario 2: System resume failed with black screen freeze,a forced restart of the machine is require.
>>> the log for use platform bus:
>>> dmesg-platform bus-device bound-black screen.log
>>> the log for use faux bus:
>>> dmesg-faux bus-device bound-black screen.log
>>
>> That is bad. When does this happen? Is there a difference in
>> configuration?
>>
> No, there's no difference. The same code is used for two different failure scenarios.
> I just tested many times. These two different situations occur probabilistically.
> But they both happened only when the USBIP device is bound to the vhci_hcd driver.
>> The behavior same for platform bus and faux bus. Sounds like
>> an existing problem that needs to be debugged separately.
>>
>> I will take a look at all these logs and get back to you in
>> a day or two.
>>

I looked at the logs and here is what I found:

Scenario 1:
   dmesg-faux bus-device bound-auto resume.log
   dmesg-platform bus-device bound-auto resume.log

In this case suspend bailed out way before driver suspend
when vhci_hcd is using platform and faux bus.

Freezing remaining freezable tasks failed after 20.006 seconds (0 tasks refusing to freeze, wq_busy=1)
Restarting kernel threads ... done
OOM killer enabled.
Restarting tasks ... done.
random: crng reseeded on system resumption
PM: suspend exit

Auto-resume of the user-space worked. Scenario 1 isn't really
interesting.

Scenario 2:
   dmesg-faux bus-device bound-black screen.log
   dmesg-platform bus-device bound-black screen.log

Even though the result is the same in seeing blank screen, how
we get there is different.

=================
faux-bus device:
=================
- suspend worked - looks like
   usb 4-1: PM: calling usb_dev_suspend @ 6054, parent: usb4
   usb 4-1: PM: usb_dev_suspend returned 0 after 13675 usecs
   usb usb4: PM: calling usb_dev_suspend @ 6055, parent: vhci_hcd.0
   usb usb4: PM: usb_dev_suspend returned 0 after 9 usecs

vhci_hcd suspend isn't in play here. usb_dev_suspend() returns.
See below

usb 4-1: PM: usb_dev_suspend returned message.

-------------------------------------------------------------

- resume started (assuming it has been initiated by user)

[  650.027491][ T6056] pci 0000:00:01.0: PM: pci_pm_suspend_noirq returned 0 after 304 usecs

See see timestamp difference between the last suspend message and the
first resume message.
[  674.000257][   T39] pci 0000:00:00.0: PM: calling pci_pm_resume_noirq @ 39, parent: pci0000:00

usb 4-1 usb_dev_resume never returns.

[  674.071125][ T6117] usb usb4: PM: usb_dev_resume returned 0 after 21110 usecs
[  674.113991][ T6126] usb 4-1: PM: calling usb_dev_resume @ 6126, parent: usb4

-------------------------------------------------------------

=====================
platform bus device
=====================

- suspend was aborted because vhci_hcd suspend routine returned error

[  297.854621][ T9402] usb 4-1: PM: calling usb_dev_suspend @ 9402, parent: usb4
[  297.868524][ T9402] usb 4-1: PM: usb_dev_suspend returned 0 after 13214 usecs
[  297.869994][ T9403] usb usb4: PM: calling usb_dev_suspend @ 9403, parent: vhci_hcd.0
[  297.871959][ T9403] usb usb4: PM: usb_dev_suspend returned 0 after 30 usecs
[  297.873644][ T9394] vhci_hcd vhci_hcd.0: PM: calling platform_pm_suspend @ 9394, parent: platform
[  297.874738][ T9394] vhci_hcd vhci_hcd.0: We have 1 active connection. Do not suspend.
[  297.875369][ T9394] vhci_hcd vhci_hcd.0: PM: dpm_run_callback(): platform_pm_suspend returns -16
[  297.876078][ T9394] vhci_hcd vhci_hcd.0: PM: platform_pm_suspend returned -16 after 1341 usecs
[  297.876774][ T9394] vhci_hcd vhci_hcd.0: PM: failed to suspend: error -16

- the following triggers resume
[  297.877321][ T9394] PM: Some devices failed to suspend, or early wake event detected

[  297.881065][ T9403] usb usb3: PM: usb_dev_resume returned 0 after 19 usecs
[  297.904551][ T9408] usb usb4: PM: usb_dev_resume returned 0 after 22911 usecs
[  297.905148][ T9418] usb 4-1: PM: calling usb_dev_resume @ 9418, parent: usb4

usb 4-1 usb_dev_resume never returns.

Note - In both cases, usb_dev_resume doesn't return. When it is called is the
difference.

I don't think suspend/resume works when devices are bound. Suspend exits and
starts resume which seems to fail because it doesn't handle the virtual usb
device resume. There is a missing piece here.

When vhci_hcd imports a device and acts as a proxy between the virtual mass
storage device (e.g in this case) - it appears suspend and resume are
handled as if it is a usb device. Maybe this is incorrect?

usb_dev_suspend() works and usb_dev_resume() on these virtual usb devices?
Do we need to handle this in usb_dev_resume()?

Talking out loud - I have to do some looking into.

thanks,
-- Shuah





Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ