lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <52b9fd0ab9d279c7da1e2c70e1f023c7e3333ad6.camel@kylinos.cn>
Date: Fri, 13 Sep 2024 09:51:40 +0800
From: duanchenghao <duanchenghao@...inos.cn>
To: Alan Stern <stern@...land.harvard.edu>
Cc: gregkh@...uxfoundation.org, pavel@....cz, linux-pm@...r.kernel.org, 
	niko.mauno@...sala.com, stanley_chang@...ltek.com, tj@...nel.org, 
	linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org
Subject: Re: [PATCH] USB: Fix the issue of task recovery failure caused by
 USB status when S4 wakes up

在 2024-09-12星期四的 11:00 -0400,Alan Stern写道:
> On Thu, Sep 12, 2024 at 11:21:26AM +0800, duanchenghao wrote:
> > 在 2024-09-11星期三的 10:40 -0400,Alan Stern写道:
> > > On Tue, Sep 10, 2024 at 05:36:56PM +0800, duanchenghao wrote:
> > > > S4 wakeup restores the image that was saved before the system
> > > > entered
> > > > the S4 sleep state.
> > > > 
> > > >     S4 waking up from hibernation
> > > >     =============================
> > > >     kernel initialization
> > > >     |   
> > > >     v   
> > > >     freeze user task and kernel thread
> > > >     |   
> > > >     v   
> > > >     load saved image
> > > >     |    
> > > >     v   
> > > >     freeze the peripheral device and controller
> > > >     (Check the HCD_FLAG_WAKEUP_ PENDING flag of the USB. If it
> > > > is
> > > > set,
> > > >      return to EBUSY and do not perform the following restore
> > > > image.)
> > > 
> > > Why is the flag set at this point?  It should not be; the device
> > > and 
> > > controller should have been frozen with wakeup disabled.
> > > 
> > This is check point, not set point.
> 
> Yes, I know that.  But when the flag was checked, why did the code
> find 
> that it was set?  The flag should have been clear.

Yes, the current issue is that during S4 testing, there is a
probabilistic scenario where clear_bit is not called after set_bit, or
clear_bit is called but does not execute after set_bit. Please refer to
the two modification points in the v2 patch for details, as both of
them can cause the current issue.

> 
> > > Is your problem related to the one discussed in this email
> > > thread?
> > > 
> > > https://lore.kernel.org/linux-usb/d8600868-6e4b-45ab-b328-852b6ac8ecb5@rowland.harvard.edu/
> > > 
> > > Would the suggestion I made there -- i.e., have the xhci-hcd 
> > > interrupt handler skip calling usb_hcd_resume_root_hub() if the
> > > root
> > > hub 
> > > was suspended with wakeup = 0 -- fix your problem?
> > 
> > Skipping usb_hcd_resume_root_hub() should generally be possible,
> > but
> > it's important to ensure that normal remote wakeup functionality is
> > not
> > compromised. Is it HUB_SUSPEND that the hub you are referring to is
> > in
> > a suspended state?
> 
> I don't understand this question.  hub_quiesce() gets called with 
> HUB_SUSPEND when the hub enters a suspended state.
> 
> You are correct about the need for normal remote wakeup to work 
> properly.  The interrupt handler should skip calling 
> usb_hcd_resume_root_hub() for port connect or disconnect changes and
> for 
> port overcurrent changes (when the root hub is suspended with wakeup
> = 
> 0).  But it should _not_ skip calling usb_hcd_resume_root_hub() for
> port 
> resume events.

The current issue arises when rh_state is detected as RH_SUSPEND and
usb_hcd_resume_root_hub() is called to resume the root hub. However,
there is no mutual exclusion between the suspend flag, set_bit, and
clear_bit, which can lead to two scenarios:

    1. After set_bit is called, the state of the USB device is modified
by another process to !USB_STATE_SUSPEND, preventing the hub's resume
from being executed, and consequently, clear_bit is not called again.

    2. In another scenario, during the hub resume process, after
HCD_FLAG_WAKEUP_PENDING is cleared by clear_bit, rh_state has not yet
been set to !RH_SUSPENDED. At this point, set_bit is executed, but
since the hub has already entered the running state, the clear_bit
associated with the resume operation is not executed.

Please review the v2 patch, where I have described both the logical
flow before the modification and the revised logical flow after the
modification.

> 
> Alan Stern


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ