[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e86141a8-a490-b100-15ca-5759ef00f2f8@linux.intel.com>
Date: Thu, 4 Jan 2018 15:38:29 +0200
From: Mathias Nyman <mathias.nyman@...ux.intel.com>
To: yinbo.zhu@....com, Felipe Balbi <balbi@...com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Mathias Nyman <mathias.nyman@...el.com>,
"open list:DESIGNWARE USB3 DRD IP DRIVER" <linux-usb@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>
Cc: xiaobo.xie@....com, jerry.huang@....com, ran.wang_1@....com
Subject: Re: [PATCH v4 1/3] usb: host: Implement workaround for Erratum
A-007463
On 19.12.2017 12:16, yinbo.zhu@....com wrote:
> From: yinbo.zhu <yinbo.zhu@....com>
>
> When a transaction error (defined in Section 4.10.2.3, "USB
> Transaction Error" of the xHCI Specification) occurs on the
> USB, the host controller reports this through a transfer
> event with the completion code "USB Transaction Error". When
> this happens, the endpoint is placed in the Halted state. In
> response, software must issue a Reset Endpoint command to
> transition the endpoint to the Stopped state. In order to
> restart the transfer, the driver can perform either of the
> following:
> a) Ring the doorbell again, which restarts the transfer from
> where it stopped, or
> b) Issue a Set TR (Transfer Ring) Dequeue Pointer command
> for the endpoint to start the transfer form a different
> Transfer Ring pointer Consider the following
> scenario:
> 1. The xHCI driver prepares a control transfer read to one
> of the device's control endpoints;
> 2. During the IN data stage, a transaction error occurs on
> the USB, causing a transfer event with the completion
> code "USB Transaction Error";
> 3. The driver issues a Reset Endpoint command;
> 4. The driver rings the doorbell of the control endpoint to
> resume the transfer. In this scenario, the controller
> may reverse the direction of the data stage from IN to OUT.
The xhci driver should not ring the doorbell without first issuing a
Set TR Dequeue Pointer command in the USB Transaction error case.
code flow:
handle_tx_event()
case COMP_USB_TRANSACTION_ERROR:
status = -EPROTO;
process_ctrl_td()
switch (trb_comp_code)
default:
if (!xhci_requires_manual_halt_cleanup() // FALSE
break;
/* else fall through */
case COMP_STALL_ERROR:
finish_td()
if (xhci_requires_manual_halt_cleanup()) // TRUE
xhci_cleanup_halted_endpoint(...,EP_HARD_RESET)
xhci_queue_reset_ep()
xhci_cleanup_stalled_ring()
xhci_find_new_dequeue_state()
xhci_queue_new_dequeue_state()
If you can see the driver ringing the doorbell before the
Set TR Dequeue Pointer command then there is some race going
on in the driver
> Instead of sending an ACK to the endpoint to poll for read
> data, it sends a Data Packet (DP) to the endpoint. It
> fetches the data from the data stage Transfer Request
> Block (TRB) that is being resumed, even though the data
> buffer is setup to receive data and not transmit it.
> NOTE
> This issue occurs only if the transaction error happens
> during an IN data stage. There is no issue if the transaction
> error happens during an OUT data stage.
Have you been able to trigger this issue?
>
> Impact: When this issue occurs, the device likely responds in
> one of the following ways:
> a) The device responds with a STALL because the data stage
> has unexpectedly changed directions. The controller then
> generates a Stall Error transfer event, to which software
> must issue a Reset Endpoint command followed by a Set TR
> Dequeue Pointer command pointing to a new Setup TRB to clear
> the STALL condition.
> b) The device does not respond to the inverted data stage and
> the transaction times out. The controller generates another
> USB Transaction Error transfer event, to which software
> likely performs a USB Reset to the device because it is
> unresponsive. It is not expected that any of these recovery
> steps will cause instability in the system because this
> recovery is part of a standard xHCI driver and could happen
> regardless of the defect. Another possible system-level
> impact is that the controller attempts to read from the
> memory location pointed at by the Data Stage TRB or a Normal
> TRB chained to it. associated with this TRB is intended to be
> written by the controller, but the controller reads from it
> instead. Normally, this does not cause a problem. However, if
> the system has some type of memory protection where this
> unexpected read is treated as a bus error,
> a problem. However, if the system has some type of memory
> it may cause the system to become unstable or to crash.
>
> Workaround: If a USB Transaction Error occurs during the IN
> data phase of a control transfer, the driver must use the
> Set TR Dequeue Pointer command to either restart the data
> Phase or restart the entire control transfer from the
> Setup phase.
The Set TR Dequeue pointer command should already be the default way
the xhci driver handles this now.
>
> Configs Affected:
> LS1021-20-22A-R1.0, LS1021-20-22A-R2.007463
>
> Signed-off-by: yinbo zhu <yinbo.zhu@....com>
> ---
> @@ -1951,14 +1953,30 @@ static int finish_td(struct xhci_hcd *xhci, struct xhci_td *td,
> if (trb_comp_code == COMP_STALL_ERROR ||
> xhci_requires_manual_halt_cleanup(xhci, ep_ctx,
> trb_comp_code)) {
> - /* Issue a reset endpoint command to clear the host side
> - * halt, followed by a set dequeue command to move the
> - * dequeue pointer past the TD.
> - * The class driver clears the device side halt later.
> + /*erratum A-007463:
> + *After transaction error, controller switches control transfer
> + *data stage from IN to OUT direction.
> */
> - xhci_cleanup_halted_endpoint(xhci, slot_id, ep_index,
> + remaining = EVENT_TRB_LEN(le32_to_cpu(event->transfer_len));
> + if (remaining && xhci_requires_manual_halt_cleanup(xhci, ep_ctx,
> + trb_comp_code) &&
> + (xhci->quirks & XHCI_REVERSE_IN_OUT)) {
> + memset(&deq_state, 0, sizeof(deq_state));
> + xhci_find_new_dequeue_state(xhci, slot_id,
> + ep_index, td->urb->stream_id, td, &deq_state);
> + xhci_queue_new_dequeue_state(xhci, slot_id, ep_index,
> + &deq_state);
> + xhci_ring_cmd_db(xhci);
Even if all these changes should be unnecessary the endpoint is still halted
here, and we would need a reset endpoint command here as well, right?
Has this new codepath been tested?
> + } else {
> + /* Issue a reset endpoint command to clear the host side
> + * halt, followed by a set dequeue command to move the
> + * dequeue pointer past the TD.
> + * The class driver clears the device side halt later.
> + */
> + xhci_cleanup_halted_endpoint(xhci, slot_id, ep_index,
> ep_ring->stream_id, td, ep_trb,
> EP_HARD_RESET);
-Mathias
Powered by blists - more mailing lists