lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0f747f03-72e8-84de-c1c1-297398ca516c@huawei.com>
Date:   Mon, 11 Jan 2021 17:38:39 +0800
From:   liulongfang <liulongfang@...wei.com>
To:     Alan Stern <stern@...land.harvard.edu>
CC:     <gregkh@...uxfoundation.org>, <linux-usb@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] USB:ehci:fix an interrupt calltrace error

On 2020/11/28 11:22, liulongfang Wrote:
> On 2020/11/27 23:47, Alan Stern Wrote:
>> On Fri, Nov 27, 2020 at 10:29:03AM +0800, liulongfang wrote:
>>> On 2020/11/27 0:08, Alan Stern Wrote:
>>>> On Thu, Nov 26, 2020 at 11:34:33AM +0800, Longfang Liu wrote:
>>>>> The system goes to suspend when using USB audio player. This causes
>>>>> the USB device continuous send interrupt signal to system, When the
>>>>> number of interrupts exceeds 100000, the system will forcibly close
>>>>> the interrupts and output a calltrace error.
>>>>
>>>> This description is very confusing.  USB devices do not send interrupt 
>>>> signals to the host.  Do you mean that the device sends a wakeup 
>>>> request?  Or do you mean something else?
>>> The irq type is IRQ_NONE,It's counted in the note_interrupt function.
>>> From the analysis of the driver code, that are indeed  interrupt signals.
>>
>> Above you wrote: "the USB device continuous send interrupt signal to 
>> system".  But that's not correct.  The interrupt signals are sent by the 
>> USB host controller, not by the USB audio device.
>>
> OK, I will modify the description in the next patch.
>> The patch description should mention that this happens only with some 
>> Synopsys host controllers.
>>
>>>>> When the system goes to suspend, the last interrupt is reported to
>>>>> the driver. At this time, the system has set the state to suspend.
>>>>> This causes the last interrupt to not be processed by the system and
>>>>> not clear the interrupt state flag. This uncleared interrupt flag
>>>>> constantly triggers new interrupt event. This causing the driver to
>>>>> receive more than 100,000 interrupts, which causes the system to
>>>>> forcibly close the interrupt report and report the calltrace error.
>>>>
>>>> If the driver receives an interrupt, it is supposed to process the event 
>>>> even if the host controller is suspended.  And when ehci_irq() runs, it 
>>>> clears the bits that are set in the USBSYS register.
>>> When the host controller is suspended, the ehci_suspend() will clear
>>> the HCD_FLAG_HW_ACCESSIBLE, and then usb_hcd_irq() will return IRQ_NONE
>>> directly without calling ehci_irq().
>>
>> Yes.  But ehci_bus_suspend() runs _before_ the host controller is 
>> suspended.  While ehci_bus_suspend() is running, usb_hcd_irq() _will_ 
>> call ehci_irq(), and ehci_irq() _will_ clear the status bits.
>>
>> After the host controller is suspended it is not supposed to generate 
>> any interrupt signals at all, because ehci_suspend() writes 0 to the 
>> USBINTR register, and it does this _before_ clearing 
>> HCD_FLAG_HW_ACCESSIBLE.
>>
> According to this process, there should be no interruption storm problem,
> but the current fact is that the problem has occurred, so the actual
> execution process did not follow the correct process above.
> 
>>>> Why is your system getting interrupts?  That is, which bits are set in 
>>>> the USBSTS register?
>>> BIT(5) and BIT(3) are setted, STS_IAA and STS_FLR.
>>
>> STS_FLR is not set in the USBINTR register, but STS_IAA is.  So that's 
>> the one which matters.
>>
>>>>> so, when the driver goes to sleep and changes the system state to
>>>>> suspend, the interrupt flag needs to be cleared.
>>>>>
>>>>> Signed-off-by: Longfang Liu <liulongfang@...wei.com>
>>>>> ---
>>>>>  drivers/usb/host/ehci-hub.c | 5 +++++
>>>>>  1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c
>>>>> index ce0eaf7..5b13825 100644
>>>>> --- a/drivers/usb/host/ehci-hub.c
>>>>> +++ b/drivers/usb/host/ehci-hub.c
>>>>> @@ -348,6 +348,11 @@ static int ehci_bus_suspend (struct usb_hcd *hcd)
>>>>>  
>>>>>  	/* Any IAA cycle that started before the suspend is now invalid */
>>>>>  	end_iaa_cycle(ehci);
>>>>> +
>>>>> +	/* clear interrupt status */
>>>>> +	if (ehci->has_synopsys_hc_bug)
>>>>> +		ehci_writel(ehci, INTR_MASK | STS_FLR, &ehci->regs->status);
>>>>
>>>> This is a very strange place to add your new code -- right in the middle 
>>>> of the IAA and unlink handling.  Why not put it in a more reasonable 
>>>> place?After the IAA is processed, clear the STS_IAA interrupt state flag.
>>>>
>>>> Also, the patch description does not mention has_synopsys_hc_bug.  The 
>>>> meaning of this flag has no connection with the interrupt status 
>>>> register, so why do you use it here?
>>> Because of our USB IP comes from Synopsys, and the uncleared flage is also caused by
>>> special hardware design, in addition, we have not tested other manufacturers' USB
>>> controllers.We don’t know if other manufacturers’ designs have this problem,
>>> so this modification is only limited to this kind of design.
>>
>> Clearing the STS_IAA flag won't hurt, no matter who manufactured the 
>> controller.  So your patch should look more like this:
>>
>> +	/* Some Synopsys controllers mistakenly leave IAA turned on */
>> +	ehci_writel(ehci, STS_IAA, &ehci->regs->status);
>>
>> And these lines should come before the "Any IAA cycle..." comment line.
>> Does that fix the problem?
> I will conduct a round of testing based on this modification
> and provide the test results.
>>
>> Alan Stern
>> .
>>
> Thanks.
> Longfang Liu
> .
> 
After continuous sleep and wake-up operation stress tests,
this problem can be solved by using a solution that
only cleans up STS_IAA
Thanks.
Longfang Liu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ