Message-ID: <20160825234740.GA12850@archie.localdomain>
Date:   Fri, 26 Aug 2016 01:47:40 +0200
From:   Clemens Gruber <clemens.gruber@...ruber.com>
To:     Peter Chen <hzpeterchen@...il.com>
Cc:     linux-usb@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org,
        Clemens Gruber <clemens.gruber@...ruber.com>
Subject: Re: chipidea: udc: kernel panic in isr_setup_status_phase

On Wed, Aug 24, 2016 at 04:11:02PM +0800, Peter Chen wrote:
> UEI is an error interrupt, and software have not handled it, so it will
> not affect ci->status.
> 
> > Should we only call isr_tr_complete_handler if UI && !UEI ?
> > 
> > Or would adding a check for ci->status == NULL in isr_setup_status_phase
> > and returning an error code also be a good idea?
> 
> I agree with that.

OK, I now return -EINVAL if ci->status == NULL. This fixes the kernel
panic, but the usb0 interface stays down and does not work.
Should I send a patch to avoid the NULL pointer dereference now, or
after we have found the cause of ci->status being NULL in the first place?
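
For reference, the guard described above can be modelled in a small
userspace sketch. The struct and function names below (fake_ci,
fake_request, setup_status_phase) are simplified, hypothetical stand-ins
and not the actual chipidea driver code; only the ci->status == NULL
check and the -EINVAL return come from the discussion.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a USB request. */
struct fake_request {
	int length;
};

/* Hypothetical, simplified stand-in for the controller state. */
struct fake_ci {
	struct fake_request *status;	/* models ci->status; may be NULL */
	int queued;			/* number of status requests queued */
};

/*
 * Models the guard: return -EINVAL instead of dereferencing a NULL
 * status request. Otherwise prepare a zero-length status packet.
 */
int setup_status_phase(struct fake_ci *ci)
{
	if (ci->status == NULL)
		return -EINVAL;

	ci->status->length = 0;
	ci->queued++;
	return 0;
}
```

As noted above, such a guard only avoids the crash; it does not explain
why the status request is missing in the first place.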

> 
> > 
> > Do you have an idea what's going on there and why ci->status is NULL?
> > 
> 
> I can't understand it, the only possible is the last disconnect event
> (see ci_udc_vbus_session->_gadget_stop_activity) has scheduled very late
> due to vbus lowers very slow.

I now have more information about the two different behaviors, after
adding some printk statements.

A) When it does not work:
ci_udc_vbus_session: is_active, gadget_ready=1
ci_udc_pullup: is_on=1
udc_irq: USBi_UI
isr_tr_complete_handler: when calling isr_setup_status_phase at i=8
 isr_setup_status_phase: ci: status is NULL, vbus_active=1, ep0_dir=TX
udc_irq: USBi_UI
isr_setup_packet_handler: USB_REQ_SET_ADDRESS, type=0, ci->status=NULL
 isr_setup_status_phase: ci: status is NULL, vbus_active=1, ep0_dir=RX
(This then repeats a few times, beginning from udc_irq)

B) When it works:
ci_udc_vbus_session: is_active=1 gadget_ready=1
ci_udc_pullup: is_on=1
udc_irq: USBi_SLI
_gadget_stop_activity
udc_irq: USBi_URI
udc_irq: USBi_PCI
udc_irq: USBi_UI
udc_irq: USBi_UI
_gadget_stop_activity
usb_ep_free_request
udc_irq: USBi_UI | USBi_URI
udc_irq: USBi_PCI
isr_setup_packet_handler: USB_REQ_SET_ADDRESS, ci->status is not NULL
udc_irq: USBi_UI
(The above repeats a few times from _gadget_stop_activity to USBi_UI)
(Then USBi_UI occurs many times)
configfs-gadget gadget: high-speed config #1 ..
(More USBi_UI interrupts)
IPv6: ADDRCONF (NETDEV_CHANGE): usb0: link becomes ready
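
The "udc_irq: USBi_..." lines in the traces above name the interrupt
status bits that were set. A minimal userspace sketch of such a decoder
follows. The helper name intr_names is hypothetical, and the bit
positions are an assumption based on my recollection of
drivers/usb/chipidea/bits.h, so please check them against your tree. In
the driver, the resulting string would be passed to printk or dev_dbg.

```c
#include <stdio.h>
#include <string.h>

#define BIT(n)		(1UL << (n))

/* Assumed bit positions; verify against drivers/usb/chipidea/bits.h. */
#define USBi_UI		BIT(0)	/* USB interrupt (transfer complete) */
#define USBi_UEI	BIT(1)	/* USB error interrupt */
#define USBi_PCI	BIT(2)	/* port change */
#define USBi_URI	BIT(6)	/* USB reset */
#define USBi_SLI	BIT(8)	/* suspend */

/*
 * Hypothetical helper: writes the names of all set bits into buf,
 * separated by " | ", and returns buf. Output is truncated if buf is
 * too small; unknown bits are ignored.
 */
char *intr_names(unsigned long intr, char *buf, size_t len)
{
	static const struct {
		unsigned long bit;
		const char *name;
	} tbl[] = {
		{ USBi_UI,  "USBi_UI"  },
		{ USBi_UEI, "USBi_UEI" },
		{ USBi_PCI, "USBi_PCI" },
		{ USBi_URI, "USBi_URI" },
		{ USBi_SLI, "USBi_SLI" },
	};
	size_t i, used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
		int n;

		if (!(intr & tbl[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " | " : "", tbl[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, stop here */
		used += (size_t)n;
	}
	return buf;
}
```

Logging the full decoded status word, rather than one bit per line,
would also make it visible if USBi_UEI is set together with USBi_UI.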

--

So the two cases are very different. Avoiding the NULL pointer
dereference fixed only the kernel panic, not the problem of the USB
gadget failing to initialize correctly after plugging in.

In A), the USBi_UI interrupts shouldn't arrive that early, I suppose. If
they are the reason why the problem occurred, the question is: what
triggered them?

Does the printk output give you more insight into the problem?

--

You mentioned the possibility that vbus drops too slowly, but vbus is
supplied externally by the host, and the problem occurs not only when
the cable is unplugged and plugged in again, but also at boot, when
there were no previous disconnect events.
Or did you mean something else by "vbus lowers too slow"?

Do you have any suggestions on how to approach this problem further?
Are there other spots where adding a printk would help to find out
what's causing this?

Regards,
Clemens
