linux-kernel - Re: [RFC] Question about reset order for xhci controller and pci

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b4176223-e44f-5454-0a02-b75a65384fa6@huawei.com>
Date:   Tue, 30 Apr 2019 14:41:43 +0800
From:   "Tangnianyao (ICT)" <tangnianyao@...wei.com>
To:     Alan Stern <stern@...land.harvard.edu>
CC:     <mathias.nyman@...el.com>, <gregkh@...uxfoundation.org>,
        <linux-usb@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] Question about reset order for xhci controller and pci



On 2019/4/29 22:06, Alan Stern wrote:

Hi, Alan

> On Mon, 29 Apr 2019, Tangnianyao (ICT) wrote:
> 
>> Using command "echo 1 > /sys/bus/pci/devices/0000:7a:02.0/reset"
>> on centos7.5 system to reset xhci.
>>
>> On 2019/4/26 11:07, Tangnianyao (ICT) wrote:
>>> Hi,all
>>>
>>> I've meet a problem about reset xhci and it may be caused by the
>>> reset order of pci and xhci.
>>> Using xhci-pci, when users send reset command in os(centos or red-hat os),
>>> it would first reset PCI device by pci_reset_function. During this
>>> process, it would disable BME(Bus Master Enable) and set BME=0, and
>>> then enable it and set BME=1.
>>> And then it comes to xhci reset process. First, it would send an
>>> endpoint stop command in xhci_urb_dequeue. However, this stop ep command
>>> fails to finish. The reason is that BME is set to 0 in former process and
>>> xhci RUN/STOP changes to 0, and when BME is set to 1 again, RUN/STOP doesn't
>>> recover to 1.
>>> I've checked BME behavior in xhci spec, it shows that "If the BME bit is set to 0
>>> when xHC is running, the xHC may treat this as a Host Controller Error, asserting
>>> HCE(1) and immediately halt(R/S=0 and HCH=1). Recovery from this state will
>>> require an HCRST." It seems that the stop ep command failure is reasonable.
>>> Maybe I've missed something and please let me know.
> 
> Your email subject says "Question about...".  What is the question?
> 
Sorry I didn't descibe it clearly.
When sending a reset command, now the reset order is first BME and then xhci.
BME reset would make xhci controller stop, resulting in xhci reset failure,
because it can't finish stop ep command in xhci_urb_dequeue.
I'm not sure if this situation is in expectation.

> Also, given that your question concerns what happens when you write to
> /sys/bus/pci/..., perhaps you should consider mailing it to some PCI
> maintainers as well as to the USB maintainers.
> ok, I will mailing it to PCI maintainer as well.

> Perhaps the reset was not meant to be used the way you are doing it.  
> A more conservative approach would be to unbind xhci-hcd from the 
> device before doing the reset and then rebind it afterward.
> 
> Alan Stern
>
>
> .
>

I think this approach not work. When reset BME, xhci controller is stopped and
can't recover even BME is set enable later. According to xhci spec, it's appropriate.
When rebind it afterward, with xhci stop, xhci driver would consider the
xhci controller already died and it fails to work again.

To recover xhci, now I rmmod xhci-pci.ko and then insmod it again.

Thanks,
Tang