linux-kernel - Re: XHCI vs PCM2903B/PCM2904 part 2

Message-ID: <6917929c7dd7786b5b673743ce45bbcd56e6b1f1.camel@surriel.com>
Date:   Mon, 29 Jun 2020 23:55:02 -0400
From:   Rik van Riel <riel@...riel.com>
To:     Mathias Nyman <mathias.nyman@...ux.intel.com>,
        Alan Stern <stern@...land.harvard.edu>
Cc:     linux-usb <linux-usb@...r.kernel.org>, alsa-devel@...a-project.org,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Mathias Nyman <mathias.nyman@...el.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jaroslav Kysela <perex@...ex.cz>, Takashi Iwai <tiwai@...e.com>
Subject: Re: XHCI vs PCM2903B/PCM2904 part 2

On Mon, 2020-06-29 at 23:21 -0400, Rik van Riel wrote:

> > Could you add the code below and take new traces? It will show the
> > endpoint state after the Babble error.
> 
> Hi Mathias,
> 
> I have finally rebooted into a kernel with your tracepoint.
> After a babble error, I get the following info in the trace.
> 
> [  556.716334] xhci_hcd 0000:00:14.0: Babble error for slot 13 ep 8 on endpoint
> 
>  28672.016 :0/0 xhci-hcd:xhci_handle_tx_event(info: 196609, info2: 12845096, deq: 69501877488, tx_info: 12845252)
>  34816.037 :0/0 xhci-hcd:xhci_handle_tx_event(info: 196609, info2: 12845096, deq: 69501877856, tx_info: 12845252)
>  38912.043 :0/0 xhci-hcd:xhci_handle_tx_event(info: 196609, info2: 12845096, deq: 69501870176, tx_info: 12845252)

OK, this is strange indeed.
info: 0x30001
info2: 0xc40028
tx_info: 0xc400c4

That suggests the endpoint state is EP_STATE_DISABLED, but
we never got the error from the EP_STATE_DISABLED test near
the start of handle_tx_event(). If we had, the big switch
statement containing the code below would have been bypassed.
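
For reference, the three context words from the trace can be decoded
with a small userspace helper. This is only a sketch: the macro and
struct names are made up for illustration, and the bit positions are
assumed from the field layout in mainline drivers/usb/host/xhci.h
(endpoint state in bits 2:0 of ep_info, with 0 = disabled, 1 = running,
2 = halted, 3 = stopped, 4 = error), so they are worth checking against
the tree in use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper macros. Bit positions are ASSUMED to match the
 * endpoint context layout in mainline drivers/usb/host/xhci.h.
 */
#define DEMO_EP_STATE(info)      ((info) & 0x7u)             /* ep_info  bits 2:0   */
#define DEMO_EP_INTERVAL(info)   (((info) >> 16) & 0xffu)    /* ep_info  bits 23:16 */
#define DEMO_EP_TYPE(info2)      (((info2) >> 3) & 0x7u)     /* ep_info2 bits 5:3   */
#define DEMO_EP_MAX_PACKET(i2)   (((i2) >> 16) & 0xffffu)    /* ep_info2 bits 31:16 */
#define DEMO_EP_AVG_TRB_LEN(tx)  ((tx) & 0xffffu)            /* tx_info  bits 15:0  */

struct ep_decoded {
	unsigned int state;
	unsigned int interval;
	unsigned int type;
	unsigned int max_packet;
	unsigned int avg_trb_len;
};

/* Map the assumed state encoding to a printable name. */
static const char *ep_state_name(unsigned int state)
{
	switch (state) {
	case 0: return "disabled";
	case 1: return "running";
	case 2: return "halted";
	case 3: return "stopped";
	case 4: return "error";
	default: return "reserved";
	}
}

/* Split the raw CPU-endian words printed by the tracepoint into fields. */
static void decode_ep_ctx(uint32_t info, uint32_t info2, uint32_t tx_info,
			  struct ep_decoded *out)
{
	assert(out != NULL);
	out->state = DEMO_EP_STATE(info);
	out->interval = DEMO_EP_INTERVAL(info);
	out->type = DEMO_EP_TYPE(info2);
	out->max_packet = DEMO_EP_MAX_PACKET(info2);
	out->avg_trb_len = DEMO_EP_AVG_TRB_LEN(tx_info);
}

/* Print the raw words in hex next to the decoded fields. */
static void print_ep_ctx(uint32_t info, uint32_t info2, uint32_t tx_info)
{
	struct ep_decoded d;

	decode_ep_ctx(info, info2, tx_info, &d);
	printf("info 0x%x info2 0x%x tx_info 0x%x\n",
	       (unsigned int)info, (unsigned int)info2, (unsigned int)tx_info);
	printf("state %u (%s) interval %u type %u max_packet %u avg_trb_len %u\n",
	       d.state, ep_state_name(d.state), d.interval, d.type,
	       d.max_packet, d.avg_trb_len);
}
```

Calling print_ep_ctx(196609, 12845096, 12845252) from a main() prints
the hex values listed above. Under the assumed encoding, the low three
bits of 0x30001 are 1, which would be the running state rather than
disabled, so the reading of the state above may deserve a second look.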

Unless I am mistaken, does that mean the endpoint context
(*ep_ctx) got modified while the code was in the middle of
handle_tx_event()?

What would cause that? A subsequent transfer to an endpoint
while it is in EP_STATE_HALTED, which the comment suggests 
is the expected endpoint state for a babble error?

> > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> > index 0fda0c0f4d31..373d89ef7275 100644
> > --- a/drivers/usb/host/xhci-ring.c
> > +++ b/drivers/usb/host/xhci-ring.c
> > @@ -2455,6 +2455,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
> >  	case COMP_BABBLE_DETECTED_ERROR:
> >  		xhci_dbg(xhci, "Babble error for slot %u ep %u on endpoint\n",
> >  			 slot_id, ep_index);
> > +		trace_xhci_handle_tx_event(ep_ctx);
> >  		status = -EOVERFLOW;
> >  		break;
> >  	/* Completion codes for endpoint error state */
> > diff --git a/drivers/usb/host/xhci-trace.h b/drivers/usb/host/xhci-trace.h
> > index b19582b2a72c..5081df079f4a 100644
> > --- a/drivers/usb/host/xhci-trace.h
> > +++ b/drivers/usb/host/xhci-trace.h
> > @@ -360,6 +360,11 @@ DEFINE_EVENT(xhci_log_ep_ctx, xhci_add_endpoint,
> >  	TP_ARGS(ctx)
> >  );
> >  
> > +DEFINE_EVENT(xhci_log_ep_ctx, xhci_handle_tx_event,
> > +	TP_PROTO(struct xhci_ep_ctx *ctx),
> > +	TP_ARGS(ctx)
> > +);
> > +
> >  DECLARE_EVENT_CLASS(xhci_log_slot_ctx,
> >  	TP_PROTO(struct xhci_slot_ctx *ctx),
> >  	TP_ARGS(ctx),
> > 
> > 
> > 
-- 
All Rights Reversed.