lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4759911b-8c35-4ca9-bc34-09dd41b14582@163.com>
Date: Wed, 9 Jul 2025 17:07:24 +0800
From: Zongmin Zhou <min_halo@....com>
To: Shuah Khan <skhan@...uxfoundation.org>,
 Greg KH <gregkh@...uxfoundation.org>
Cc: shuah@...nel.org, valentina.manea.m@...il.com, i@...ithal.me,
 linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
 zhouzongmin@...inos.cn
Subject: Re: [PATCH v2] usbip: convert to use faux_device


On 2025/7/9 02:16, Shuah Khan wrote:
> On 7/3/25 00:04, Zongmin Zhou wrote:
>>
>> On 2025/7/3 07:54, Shuah Khan wrote:
>>> On 7/1/25 20:12, Zongmin Zhou wrote:
>>>>
>>>> On 2025/7/2 06:56, Shuah Khan wrote:
>>>>> On 6/23/25 21:21, Zongmin Zhou wrote:
>>>>>>
>>>>>> On 2025/6/21 01:26, Shuah Khan wrote:
>>>>>>> On 6/20/25 03:27, Greg KH wrote:
>>>>>>>> On Fri, Jun 20, 2025 at 05:19:33PM +0800, Zongmin Zhou wrote:
>>>>>>>>>
>>>>>>>>> On 2025/6/20 12:29, Greg KH wrote:
>>>>>>>>>> On Fri, Jun 20, 2025 at 10:16:16AM +0800, Zongmin Zhou wrote:
>>>>>>>>>>> On 2025/6/19 19:01, Greg KH wrote:
>>>>>>>>>>>> On Wed, Jun 04, 2025 at 02:54:10PM +0800, Zongmin Zhou wrote:
>>>>>>>>>>>>> From: Zongmin Zhou <zhouzongmin@...inos.cn>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The vhci driver does not need to create a platform device,
>>>>>>>>>>>>> it only did so because it was simple to do that in order to
>>>>>>>>>>>>> get a place in sysfs to hang some device-specific attributes.
>>>>>>>>>>>>> Now the faux device interface is more appropriate,change it
>>>>>>>>>>>>> over to use the faux bus instead.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Zongmin Zhou <zhouzongmin@...inos.cn>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> Changes in v2:
>>>>>>>>>>>>> - don't change faux create api,just call probe on 
>>>>>>>>>>>>> vhci_hcd_init.
>>>>>>>>>>>>>
>>>>>>>>>>>>>     drivers/usb/usbip/vhci.h |  4 +-
>>>>>>>>>>>>>     drivers/usb/usbip/vhci_hcd.c         | 86 
>>>>>>>>>>>>> +++++++++++-----------------
>>>>>>>>>>>>>     drivers/usb/usbip/vhci_sysfs.c       | 68 
>>>>>>>>>>>>> +++++++++++-----------
>>>>>>>>>>>>>     tools/usb/usbip/libsrc/vhci_driver.h |  2 +-
>>>>>>>>>>>>>     4 files changed, 72 insertions(+), 88 deletions(-)
>>>>>>>>>>>> I get the following build errors from this patch:
>>>>>>>>>>>>
>>>>>>>>>>>> drivers/usb/usbip/vhci_hcd.c:1462:12: error: 
>>>>>>>>>>>> ‘vhci_hcd_resume’ defined but not used 
>>>>>>>>>>>> [-Werror=unused-function]
>>>>>>>>>>>>     1462 | static int vhci_hcd_resume(struct faux_device 
>>>>>>>>>>>> *fdev)
>>>>>>>>>>>>          |            ^~~~~~~~~~~~~~~
>>>>>>>>>>>> drivers/usb/usbip/vhci_hcd.c:1418:12: error: 
>>>>>>>>>>>> ‘vhci_hcd_suspend’ defined but not used 
>>>>>>>>>>>> [-Werror=unused-function]
>>>>>>>>>>>>     1418 | static int vhci_hcd_suspend(struct faux_device 
>>>>>>>>>>>> *fdev, pm_message_t state)
>>>>>>>>>>>>          |            ^~~~~~~~~~~~~~~~
>>>>>>>>>>>> cc1: all warnings being treated as errors
>>>>>>>>>>>>
>>>>>>>>>>>> Are you sure you tested this?
>>>>>>>>>>> I apologize for not enabling -Werror, which resulted in 
>>>>>>>>>>> missing this error
>>>>>>>>>>> warning.
>>>>>>>>>>> I have tested usbip feature use the new patch,but not test 
>>>>>>>>>>> system
>>>>>>>>>>> suspend/resume.
>>>>>>>>>>> The faux bus type don't add pm function,and vhci-hcd driver 
>>>>>>>>>>> can't register
>>>>>>>>>>> it.
>>>>>>>>>>> Maybe have to add suspend/resume for it.like below:
>>>>>>>>>>> static const struct bus_type faux_bus_type = {
>>>>>>>>>>>       .name        = "faux",
>>>>>>>>>>>       .match        = faux_match,
>>>>>>>>>>>       .probe        = faux_probe,
>>>>>>>>>>>       .remove        = faux_remove,
>>>>>>>>>>>       .resume     = faux_resume,
>>>>>>>>>>>       .suspend    = faux_suspend,
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> Is that right?
>>>>>>>>>>> Your expertise would be greatly valued.
>>>>>>>>>> As this is not real hardware, why do you need the suspend/resume
>>>>>>>>>> callbacks at all?  What is happening here that requires them?
>>>>>>>>> @greg,
>>>>>>>>> The vhci_hcd_suspend/vhci_hcd_resume interfaces are not 
>>>>>>>>> designed for this
>>>>>>>>> faux device, but rather to
>>>>>>>>> manipulate the HCD_FLAG_HW_ACCESSIBLE bit in the hcd flags 
>>>>>>>>> associated with
>>>>>>>>> the faux device.
>>>>>>>>> For example:
>>>>>>>>> During system standby: clear_bit(HCD_FLAG_HW_ACCESSIBLE, 
>>>>>>>>> &hcd->flags)
>>>>>>>>> During system wakeup: set_bit(HCD_FLAG_HW_ACCESSIBLE, 
>>>>>>>>> &hcd->flags)
>>>>>>>>>
>>>>>>>>> Previously, these two functions were registered through 
>>>>>>>>> platform_driver,
>>>>>>>>> but faux bus does not have the relevant interface, so they 
>>>>>>>>> were not called,
>>>>>>>>> resulting in this compilation warning error.
>>>>>>>>>
>>>>>>>>> This raises the question: Should the faux bus implement 
>>>>>>>>> PM-related
>>>>>>>>> interface?
>>>>>>>>> I'm uncertain whether these functions are essential for the 
>>>>>>>>> vhci-hcd driver
>>>>>>>>> or if they can be safely removed.
>>>>>>>>>
>>>>>>>>> However, during system standby/wakeup tests with remote USB 
>>>>>>>>> devices bound to
>>>>>>>>> the vhci-hcd driver,
>>>>>>>>> I observed consistent failure scenarios across both the 
>>>>>>>>> original platform
>>>>>>>>> bus and faux bus patch implementations.
>>>>>>>
>>>>>>> suspend and resume hooks have been in the code from beginning
>>>>>>> in the CONFIG_PM path. The original authors are skeptical about
>>>>>>> what should happen during suspend"
>>>>>>>
>>>>>>> /* what should happen for USB/IP under suspend/resume? */
>>>>>>> suspend hook checks for active connections and sends EBBUSY
>>>>>>> back to pm core.
>>>>>>>
>>>>>>> Active connection means any of the ports are in 
>>>>>>> USB_PORT_STAT_CONNECTION
>>>>>>> state. So as long as there are active connections between vhci 
>>>>>>> client
>>>>>>> and the host, suspend won't work. So we really don't know what 
>>>>>>> happens
>>>>>>> to the active connections if there are no suspend/resume hooks.
>>>>>>>
>>>>>>> If there are no active connections, then it will clear the 
>>>>>>> HCD_FLAG_HW_ACCESSIBLE
>>>>>>> bit and returns to pm core to continue with suspend.
>>>>>>>
>>>>>>> resume sets the HCD_FLAG_HW_ACCESSIBLE flag to indicate hardware 
>>>>>>> is accessible
>>>>>>> and initiates port status poll.
>>>>>>>
>>>>>>> - suspend hook prevents suspend
>>>>>>>
>>>>>>> With faux device since there is no suspend and resume hook, I 
>>>>>>> would expect
>>>>>>> these hooks are not a factor in suspend and resume.
>>>>>>>
>>>>>>> vhci_hcd is a special case virtual driver as it is a proxy for 
>>>>>>> controlling
>>>>>>> hardware on the host.
>>>>>>>
>>>>>>> Zongmin,
>>>>>>>
>>>>>>> Do suspend/resume work when vhci_hcd is not loaded?
>>>>>>> What happens when when system suspends and resumes with faux 
>>>>>>> device?
>>>>>>> Can you send me dmesg logs and pm logs?
>>>>>>>
>>>>>> Hi Greg,shuah
>>>>>>
>>>>>> Yes, system with vhci_hcd unload or just load vhci_hcd driver
>>>>>> without usbip devices bound to vhci_hcd,system suspend/resume 
>>>>>> worked well.
>>>>>> Some logs for you.
>>>>>
>>>>> Sorry for the delay. I was at a conference last week.
>>>>> Thank you for sending these logs.
>>>>>
>>>>>> 1.system with vhci_hcd driver load,but don't bound any usbip 
>>>>>> devices to vhci_hcd,suspend/resume worked well.
>>>>>> see dmesg-faux bus-load.log
>>>>>
>>>>>
>>>>>>
>>>>>> 2.usbip device bound to vhci_hcd,and do system suspend 
>>>>>> action,suspend/resume worked failed.
>>>>>> I observed two distinct failure scenario:
>>>>>> Scenario 1: System failed to enter suspend state,and will auto 
>>>>>> resume;
>>>>>
>>>>>
>>>>>> the log for use platform bus:
>>>>>> dmesg-platform bus-device bound-auto resume.log
>>>>>
>>>>> This is clear from the suspend hook in the vhci_hcd.
>>>>> When it returns EBUSY, suspend will fail and move to
>>>>> auto resume.
>>>>>
>>>>>> the log for use faux bus:
>>>>>> dmesg-faux bus-device bound-auto resume.log
>>>>>
>>>>> It is good that the behavior is same with faux bus even though
>>>>> suspend hook isn't called. I will take a look at the log.
>>>>>
>>>>>>
>>>>>> Scenario 2: System resume failed with black screen freeze,a 
>>>>>> forced restart of the machine is require.
>>>>>> the log for use platform bus:
>>>>>> dmesg-platform bus-device bound-black screen.log
>>>>>> the log for use faux bus:
>>>>>> dmesg-faux bus-device bound-black screen.log
>>>>>
>>>>> That is bad. When does this happen? Is there a difference in
>>>>> configuration?
>>>>>
>>>> No, there's no difference. The same code is used for two different 
>>>> failure scenarios.
>>>> I just tested many times. These two different situations occur 
>>>> probabilistically.
>>>> But they both happened only when the USBIP device is bound to the 
>>>> vhci_hcd driver.
>>>>> The behavior same for platform bus and faux bus. Sounds like
>>>>> an existing problem that needs to be debugged separately.
>>>>>
>>>>> I will take a look at all these logs and get back to you in
>>>>> a day or two.
>>>>>
>>>
>>> I looked at the logs and here is what I found:
>>>
>>> Scenario 1:
>>>   dmesg-faux bus-device bound-auto resume.log
>>>   dmesg-platform bus-device bound-auto resume.log
>>>
>>> In this case suspend bailed out way before driver suspend
>>> when vhci_hcd is using platform and faux bus.
>>>
>>> Freezing remaining freezable tasks failed after 20.006 seconds (0 
>>> tasks refusing to freeze, wq_busy=1)
>>> Restarting kernel threads ... done
>>> OOM killer enabled.
>>> Restarting tasks ... done.
>>> random: crng reseeded on system resumption
>>> PM: suspend exit
>>>
>>> Auto-resume of the user-space worked. Scenario 1 isn't really
>>> interesting.
>>>
>>> Scenario 2:
>>>   dmesg-faux bus-device bound-black screen.log
>>>   dmesg-platform bus-device bound-black screen.log
>>>
>>> Even though the result is the same in seeing blank screen, how
>>> we get there is different.
>>>
>>> =================
>>> faux-bus device:
>>> =================
>>> - suspend worked - looks like
>>>   usb 4-1: PM: calling usb_dev_suspend @ 6054, parent: usb4
>>>   usb 4-1: PM: usb_dev_suspend returned 0 after 13675 usecs
>>>   usb usb4: PM: calling usb_dev_suspend @ 6055, parent: vhci_hcd.0
>>>   usb usb4: PM: usb_dev_suspend returned 0 after 9 usecs
>>>
>>> vhci_hcd suspend isn't in play here. usb_dev_suspend() returns.
>>> See below
>>>
>>> usb 4-1: PM: usb_dev_suspend returned message.
>>>
>>> -------------------------------------------------------------
>>>
>>> - resume started (assuming it has been initiated by user)
>>>
>>> [  650.027491][ T6056] pci 0000:00:01.0: PM: pci_pm_suspend_noirq 
>>> returned 0 after 304 usecs
>>>
>>> See see timestamp difference between the last suspend message and the
>>> first resume message.
>>> [  674.000257][   T39] pci 0000:00:00.0: PM: calling 
>>> pci_pm_resume_noirq @ 39, parent: pci0000:00
>>>
>>> usb 4-1 usb_dev_resume never returns.
>>>
>>> [  674.071125][ T6117] usb usb4: PM: usb_dev_resume returned 0 after 
>>> 21110 usecs
>>> [  674.113991][ T6126] usb 4-1: PM: calling usb_dev_resume @ 6126, 
>>> parent: usb4
>>>
>>> -------------------------------------------------------------
>>>
>>> =====================
>>> platform bus device
>>> =====================
>>>
>>> - suspend was aborted because vhci_hcd suspend routine returned error
>>>
>>> [  297.854621][ T9402] usb 4-1: PM: calling usb_dev_suspend @ 9402, 
>>> parent: usb4
>>> [  297.868524][ T9402] usb 4-1: PM: usb_dev_suspend returned 0 after 
>>> 13214 usecs
>>> [  297.869994][ T9403] usb usb4: PM: calling usb_dev_suspend @ 9403, 
>>> parent: vhci_hcd.0
>>> [  297.871959][ T9403] usb usb4: PM: usb_dev_suspend returned 0 
>>> after 30 usecs
>>> [  297.873644][ T9394] vhci_hcd vhci_hcd.0: PM: calling 
>>> platform_pm_suspend @ 9394, parent: platform
>>> [  297.874738][ T9394] vhci_hcd vhci_hcd.0: We have 1 active 
>>> connection. Do not suspend.
>>> [  297.875369][ T9394] vhci_hcd vhci_hcd.0: PM: dpm_run_callback(): 
>>> platform_pm_suspend returns -16
>>> [  297.876078][ T9394] vhci_hcd vhci_hcd.0: PM: platform_pm_suspend 
>>> returned -16 after 1341 usecs
>>> [  297.876774][ T9394] vhci_hcd vhci_hcd.0: PM: failed to suspend: 
>>> error -16
>>>
>>> - the following triggers resume
>>> [  297.877321][ T9394] PM: Some devices failed to suspend, or early 
>>> wake event detected
>>>
>>> [  297.881065][ T9403] usb usb3: PM: usb_dev_resume returned 0 after 
>>> 19 usecs
>>> [  297.904551][ T9408] usb usb4: PM: usb_dev_resume returned 0 after 
>>> 22911 usecs
>>> [  297.905148][ T9418] usb 4-1: PM: calling usb_dev_resume @ 9418, 
>>> parent: usb4
>>>
>>> usb 4-1 usb_dev_resume never returns.
>>>
>>> Note - In both cases, usb_dev_resume doesn't return. When it is 
>>> called is the
>>> difference.
>>>
>>> I don't think suspend/resume works when devices are bound. Suspend 
>>> exits and
>>> starts resume which seems to fail because it doesn't handle the 
>>> virtual usb
>>> device resume. There is a missing piece here.
>>>
>>> When vhci_hcd imports a device and acts as a proxy between the 
>>> virtual mass
>>> storage device (e.g in this case) - it appears suspend and resume are
>>> handled as if it is a usb device. Maybe this is incorrect?
>>>
>>> usb_dev_suspend() works and usb_dev_resume() on these virtual usb 
>>> devices?
>>> Do we need to handle this in usb_dev_resume()?
>>>
>>> Talking out loud - I have to do some looking into.
>>>
>> Re:
>> Yes, your analysis is completely correct.
>>
>> In fact, I've experimented with adding PM hooks to the faux bus,
>> and found that faux bus devices then behave identically to platform 
>> bus devices during suspend/resume.
>> See the attachment.
>>
>
> Thanks for checking this scenario. No surprises here.
Another part of my purpose in doing this is that the vhci-hcd driver seems
should still retain suspend/resume hooks. Therefore, the faux bus should
add corresponding hooks to allow the driver to call its own pm functions.
Though currently don't know how to fix this problem yet.
>
>> This is likely a historical legacy issue.
>
> It is an existing problem in the way vhci_hcd and the bound devices
> handle suspend/resume. vhci_hcd suspend assumes once it returns
> "don't suspend" the rest works. However suspend for the bound device
> runs first and a subsequent resume on it fails.
>
> This problem needs fixing - I don't know yet how to.
>
>> Regarding this matter, is there anything else you'd like me to assist 
>> with?
>>
>
> One thing I am curious about is the status of the bound devices after
> "forced restart of the machine" when you see blank screen or hang?
That's an excellent question.Another persistent problem has surfaced 
during troubleshooting:
After a USB/IP device is bound to the vhci-hcd driver, when the machine
reboots or performs a forced restart(include resume fail,forced restart),
attempting to re-bind the device to the vhci-hcd driver will generate 
the error message
"usbip: error: Attach Request for 4-2 failed - Device busy (exported)".
At this point, the device must first be explicitly detached
from the usbip-host driver before it can be bound again.

I implemented a shutdown hook (can refer to the attached file) in the 
vhci-hcd driver
to perform vhci_hcd remove before system shutdown.
This resolves failures where devices cannot rebind to
the vhci-hcd driver after normal reboots. However,
it remains ineffective for forced shutdown/reboot scenarios
since no shutdown functions are executed in such cases.

That's all details I've gathered yet.
>
> thanks,
> -- Shuah
View attachment "0001-driver-core-add-shutdown-hook-on-faux-bus.patch" of type "text/x-patch" (2793 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ