linux-kernel - Re: [PATCH 1/9] usb: vhci-hcd: Prevent suspending virtually attached devices

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <bc533603-c7e3-4661-9b1e-ed1818d9f7cc@collabora.com>
Date: Fri, 25 Jul 2025 13:20:01 +0300
From: Cristian Ciocaltea <cristian.ciocaltea@...labora.com>
To: Shuah Khan <skhan@...uxfoundation.org>,
 Alan Stern <stern@...land.harvard.edu>
Cc: Valentina Manea <valentina.manea.m@...il.com>,
 Shuah Khan <shuah@...nel.org>, Hongren Zheng <i@...ithal.me>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 "Brian G. Merrell" <bgmerrell@...ell.com>, kernel@...labora.com,
 Greg Kroah-Hartman <gregkh@...e.de>, linux-usb@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/9] usb: vhci-hcd: Prevent suspending virtually attached
 devices

On 7/18/25 9:41 AM, Cristian Ciocaltea wrote:
> Hi Alan, Shuah,
> 
> On 7/17/25 10:55 PM, Shuah Khan wrote:
>> On 7/17/25 12:26, Alan Stern wrote:
>>> On Thu, Jul 17, 2025 at 06:54:50PM +0300, Cristian Ciocaltea wrote:
>>>> The VHCI platform driver aims to forbid entering system suspend when at
>>>> least one of the virtual USB ports are bound to an active USB/IP
>>>> connection.
>>>>
>>>> However, in some cases, the detection logic doesn't work reliably, i.e.
>>>> when all devices attached to the virtual root hub have been already
>>>> suspended, leading to a broken suspend state, with unrecoverable resume.
>>>>
>>>> Ensure the attached devices do not enter suspend by setting the syscore
>>>> PM flag.
>>>>
>>>> Fixes: 04679b3489e0 ("Staging: USB/IP: add client driver")
>>>> Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@...labora.com>
>>>> ---
>>>>   drivers/usb/usbip/vhci_hcd.c | 2 ++
>>>>   1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
>>>> index e70fba9f55d6a0edf3c5fde56a614dd3799406a1..762b60e10a9415e58147cde2f615045da5804a0e 100644
>>>> --- a/drivers/usb/usbip/vhci_hcd.c
>>>> +++ b/drivers/usb/usbip/vhci_hcd.c
>>>> @@ -765,6 +765,7 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
>>>>                    ctrlreq->wValue, vdev->rhport);
>>>>                 vdev->udev = usb_get_dev(urb->dev);
>>>> +            dev_pm_syscore_device(&vdev->udev->dev, true);
>>>>               usb_put_dev(old);
>>>>                 spin_lock(&vdev->ud.lock);
>>>> @@ -785,6 +786,7 @@ static int vhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
>>>>                       "Not yet?:Get_Descriptor to device 0 (get max pipe size)\n");
>>>>                 vdev->udev = usb_get_dev(urb->dev);
>>>> +            dev_pm_syscore_device(&vdev->udev->dev, true);
>>>>               usb_put_dev(old);
>>>>               goto out;
>>>
>>> This looks very strange indeed.
>>>
>>> First, why is vhci_urb_enqueue() the right place to do this?  I should
>>> think you would want to do this just once per device, at the time it is
>>> attached.  Not every time a new URB is enqueued.
>>
>> Correct. This isn't the right place to do this even if we want to go with
>> the option to prevent suspend. The possible place to do this would be
>> from rh_port_connect() in which case you will have access to usb_hcd device.
> 
> Oh, I chose to handle this in vhci_urb_enqueue() as it seemed to be the only
> place where vdev->udev gets assigned.  Now I wonder if that assignment
> should really be here, but probably I'm missing something obvious as I'm
> still in the process of getting familiar with the code base.

After digging a bit further into the process of establishing a virtual
connection, I concluded rh_port_connect() cannot be used for the syscore
flag setup since the usb_device instance part of struct vhci_device
(vdev->udev here) is not yet initialized at that time.  That is because this
function is called right after the userspace tool responsible for initiating
a new USB/IP attachment writes the remote device information into sysfs -
see attach_store() in drivers/usb/usbip/vhci_sysfs.c.

The earliest execution context providing access to usb_device seems to be
during the enumeration process, which would explain why the vdev->udev
assignment has been done in vhci_urb_enqueue().  This should be normally
needed only when handling USB_REQ_GET_DESCRIPTOR, but for some reason
there's a reassignment in the USB_REQ_SET_ADDRESS handler as well.  I'd
assume it's possible that usb_device instance may change after
USB_REQ_GET_DESCRIPTOR, which is supposed to always precede
USB_REQ_SET_ADDRESS.

However, in all my tests done so far the operations where performed on the
same instance, hence I'm not sure if this is a real possibility or just a
leftover from downstream/experimental code base.

To remain on the safe side, I'll keep both calls to dev_pm_syscore_device(),
while adding in each case a comment on top with the relevant information. 

>> This has to be undone from rh_port_disconnect(). Also how does this impact
>> the usbip_host - we still need to handle usbip_host suspend.
> 
> I've only addressed usbip_vhci suspend prevention at the moment, as that was
> supposed to work.
> 
>>>
>>> Second, how do these devices ever go back to being regular non-syscore
>>> things?
> 
> This only handles the client side, i.e. the virtually attached devices,
> hence I didn't pay much attention to undo the syscore thing (wasn't
> straightforward to accomplish via the URB handling path anyway).  I'll
> definitely fix this up by moving to rh_port_[dis]connect(), as Shuah
> suggested.
> 
>>> Third, if this change isn't merely a temporary placeholder, it certainly
>>> needs to have a comment in the code to explain what it does and why.
> 
> Indeed, will document this.
> 
>>> Fourth, does calling dev_pm_syscore_device() really prevent the device
>>> from going into suspend?  
> 
> Yes, this is managed by core PM infra which basically skips processing of
> any PM callbacks when the flag is set - e.g. see how dev->power.syscore is
> handled in device_suspend():
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c?h=v6.16-rc6#n1800
> 
>>> What about runtime suspend?  
> 
> I think that's a slightly different topic - so far I've been only focused on
> system suspend.
> 
>> And what good
>>> does it to do prevent the device from being suspended if the entire
>>> server gets suspended?
> 
> As mentioned above, the target for now is to unbreak the system suspend
> prevention on the client side.  The server side doesn't seem to support it.
> 
>>>
>>> Fifth, the patch description says the purpose is to prevent the server
>>> from going into system suspend.  
> 
> Hmm, I might need to improve the description in this case, as I only
> mentioned VHCI and the virtual hub/ports, hence I was referring to the
> client side only.
> 
>>> How does marking some devices with
>>> dev_pm_syscore_device() accomplish this?
> 
> Please check the link above.
> 
>>
>> We have been discussing suspend/resume and reboot behavior in another thread
>> that proposed converting vhci_hcd to use faux bus.
>>
>> In addition to what Alan is asking, To handle suspend/resume cleanly, the
>> following has to happen at a higher level:
>>
>> - Let the usbip hots host know client is suspending the connection.
>>   The physical device isn't suspended on the host.
>> - suspend the virtual devices and vhci_hcd
>>
>> Do the reverse to resume.
> 
> Right, I was actually looking into having a proper suspend/resume support
> rather than just preventing it on the client side (for now), but that's
> clearly not an easy task to accomplish, as it requires extending the USP/IP
> protocol and most probably also the user space tools.
> 
>>
>> I would say:
>>
>> - We don't want vhci_hcd and usbip_host preventing suspend
> 
> That's understandable, but currently vhci_hcd is supposed to prevent
> suspend, which mostly works but it's unreliable, hence my initial goal was
> to provide a simple fix for it before attempting to experiment with more
> invasive changes.
> 
>> - It might be cleaner and safer to detach the devices during
>>   suspend on both ends. This is similar to what happens now when
>>   usbip host and vhci_hcd are removed.
>> - Note that usbip_host and vhci_hcd don't fully support suspend and
>>   resume at the moment.
> 
> Thank you both for the initial feedback!
> 
> Regards,
> Cristian
>