lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMz4kuKPcA66ndU-ifyxrM+img83hPrixO0u+keZBojFF7ypYA@mail.gmail.com>
Date:   Fri, 14 Oct 2016 15:03:06 +0800
From:   Baolin Wang <baolin.wang@...aro.org>
To:     Felipe Balbi <balbi@...nel.org>
Cc:     Greg KH <gregkh@...uxfoundation.org>,
        Mark Brown <broonie@...nel.org>,
        USB <linux-usb@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] usb: dwc3: gadget: Wait for end transfer complete
 before free irq

Hi Felipe,

On 13 October 2016 at 21:34, Felipe Balbi <balbi@...nel.org> wrote:
>
>
> Hi,
>
> Baolin Wang <baolin.wang@...aro.org> writes:
>>> Baolin Wang <baolin.wang@...aro.org> writes:
>>>>>> Baolin Wang <baolin.wang@...aro.org> writes:
>>>>>>>>>> I'm thinking this is a bug in configfs interface of Gadget API, not
>>>>>>>>>> dwc3. The only reason for this to happen would be if we get to
>>>>>>>>>> ->udc_stop() with endpoints still enabled.
>>>>>>>>>>
>>>>>>>>>> Can you check if that's the case? i.e. can you check if any endpoints
>>>>>>>>>> are still enabled when we get here?
>>>>>>>>>
>>>>>>>>> No, it is not any endpoints are still enabled. Like I said in commit
>>>>>>>>> message, we will start end transfer command when disable the endpoint,
>>>>>>>>> if the end transfer command complete event comes after we have freed
>>>>>>>>> the gadget irq, it will trigger the interrupt line all the time to
>>>>>>>>> crash the system.
>>>>>>>>
>>>>>>>> I see what the problem is. Databook tells us we *must* set CMDIOC when
>>>>>>>> issuing EndTransfer command and we should always wait for Command
>>>>>>>> Complete IRQ. However, we took a shortcut and just delayed for 100us
>>>>>>>> after issuing End Transfer.
>>>>>>>
>>>>>>> Yes, but 100us delay is not enough in some scenarios, like changing
>>>>>>> function with configfs frequently, we will met problems.
>>>>>>
>>>>>> heh, 100us *is* enough. However we still get an IRQ because we requested
>>>>>> for it. If you want to fix this, then you need to find a way to get rid
>>>>>> of that 100us udelay() and add a proper wait_for_completion() to delay
>>>>>> execution until command complete IRQ fires up.
>>>>>
>>>>> I haven't tested this yet, but it shows the idea (I think we might still
>>>>> have a race with ep_queue if we're still waiting for End Transfer, but
>>>>
>>>> Yes, maybe we need check DWC3_EP_END_TRANSFER_PENDING flag when
>>>> queuing one request.
>>>
>>> Yeah, I'll add that check later. I still need to make sure we would
>>> still try to kick any transfers that might have been queued while
>>> waiting for End Transfer Command Complete IRQ.
>>
>> OK. But I still worried if there are other races in some places which
>
> There are no other places where this needs to be sorted out.
>
>> is not easy to find out by testing. No introducing race condition
>> would be one better solution, anyway I would like to test the patch
>> you send out firstly.
>
> Right, but your patch was working around another problem, rather then
> fixing it.
>
> Here's another version which should make sure everything remains working
> as it should.
>
> 8<------------------------------------------------------------------------
> From 1f922a902a21d5cee32082be18bbfbf1d46137e4 Mon Sep 17 00:00:00 2001
> From: Felipe Balbi <felipe.balbi@...ux.intel.com>
> Date: Thu, 13 Oct 2016 14:09:47 +0300
> Subject: [PATCH] usb: dwc3: gadget: wait for End Transfer to complete
>
> Instead of just delaying for 100us, we should
> actually wait for End Transfer Command Complete
> interrupt before moving on. Note that this should
> only be done if we're dealing with one of the core
> revisions that actually require the interrupt before
> moving on.
>
> Reported-by: Baolin Wang <baolin.wang@...aro.org>
> Signed-off-by: Felipe Balbi <felipe.balbi@...ux.intel.com>
> ---
>  drivers/usb/dwc3/core.h   | 10 +++++++++-
>  drivers/usb/dwc3/gadget.c | 31 ++++++++++++++++++++++++++++---
>  2 files changed, 37 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index e878366ead00..cf495d932252 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -26,6 +26,7 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/mm.h>
>  #include <linux/debugfs.h>
> +#include <linux/wait.h>
>
>  #include <linux/usb/ch9.h>
>  #include <linux/usb/gadget.h>
> @@ -504,6 +505,7 @@ struct dwc3_event_buffer {
>   * @endpoint: usb endpoint
>   * @pending_list: list of pending requests for this endpoint
>   * @started_list: list of started requests on this endpoint
> + * @wait_end_transfer: wait_queue_head_t for waiting on End Transfer complete
>   * @lock: spinlock for endpoint request queue traversal
>   * @regs: pointer to first endpoint register
>   * @trb_pool: array of transaction buffers
> @@ -529,6 +531,8 @@ struct dwc3_ep {
>         struct list_head        pending_list;
>         struct list_head        started_list;
>
> +       wait_queue_head_t       wait_end_transfer;
> +
>         spinlock_t              lock;
>         void __iomem            *regs;
>
> @@ -545,7 +549,8 @@ struct dwc3_ep {
>  #define DWC3_EP_BUSY           (1 << 4)
>  #define DWC3_EP_PENDING_REQUEST        (1 << 5)
>  #define DWC3_EP_MISSED_ISOC    (1 << 6)
> -
> +#define DWC3_EP_END_TRANSFER_PENDING (1 << 7)
> +#define DWC3_EP_KICK_POST_END_TRANSFER (1 << 8)
>         /* This last one is specific to EP0 */
>  #define DWC3_EP0_DIR_IN                (1 << 31)
>
> @@ -1047,6 +1052,9 @@ struct dwc3_event_depevt {
>  #define DEPEVT_TRANSFER_BUS_EXPIRY     2
>
>         u32     parameters:16;
> +
> +/* For Command Complete Events */
> +#define DEPEVT_PARAMETER_CMD(n) (((n) & (0xf << 24)) >> 24)
>  } __packed;
>
>  /**
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 3ba05b12e49a..a3f81b5ba901 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -598,6 +598,8 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep,
>                 reg |= DWC3_DALEPENA_EP(dep->number);
>                 dwc3_writel(dwc->regs, DWC3_DALEPENA, reg);
>
> +               init_waitqueue_head(&dep->wait_end_transfer);
> +
>                 if (usb_endpoint_xfer_control(desc))
>                         return 0;
>
> @@ -1151,6 +1153,11 @@ static int __dwc3_gadget_ep_queue(struct dwc3_ep *dep, struct dwc3_request *req)
>         if (!dwc3_calc_trbs_left(dep))
>                 return 0;
>
> +       if (dep->flags & DWC3_EP_END_TRANSFER_PENDING) {
> +               dep->flags &= DWC3_EP_KICK_POST_END_TRANSFER;
> +               return 0;
> +       }
> +
>         ret = __dwc3_gadget_kick_transfer(dep, 0);
>         if (ret && ret != -EBUSY)
>                 dwc3_trace(trace_dwc3_gadget,
> @@ -2156,6 +2163,7 @@ static void dwc3_endpoint_interrupt(struct dwc3 *dwc,
>  {
>         struct dwc3_ep          *dep;
>         u8                      epnum = event->endpoint_number;
> +       u8                      cmd;
>
>         dep = dwc->eps[epnum];
>
> @@ -2200,8 +2208,13 @@ static void dwc3_endpoint_interrupt(struct dwc3 *dwc,
>                         return;
>                 }
>                 break;
> -       case DWC3_DEPEVT_RXTXFIFOEVT:
>         case DWC3_DEPEVT_EPCMDCMPLT:
> +               cmd = DEPEVT_PARAMETER_CMD(event->parameters);
> +
> +               if (cmd == DWC3_DEPCMD_ENDTRANSFER)
> +                       dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;

I think you missed wake_up(&dep->wait_end_transfer) function here.

> +               break;
> +       case DWC3_DEPEVT_RXTXFIFOEVT:
>                 break;
>         }
>  }
> @@ -2246,6 +2259,7 @@ static void dwc3_reset_gadget(struct dwc3 *dwc)
>  }
>
>  static void dwc3_stop_active_transfer(struct dwc3 *dwc, u32 epnum, bool force)
> +__releases(&dwc->lock) __acquires(&dwc->lock)
>  {
>         struct dwc3_ep *dep;
>         struct dwc3_gadget_ep_cmd_params params;
> @@ -2295,11 +2309,22 @@ static void dwc3_stop_active_transfer(struct dwc3 *dwc, u32 epnum, bool force)
>         memset(&params, 0, sizeof(params));
>         ret = dwc3_send_gadget_ep_cmd(dep, cmd, &params);
>         WARN_ON_ONCE(ret);
> +
> +       if (dwc3_is_usb31(dwc) || dwc->revision < DWC3_REVISION_310A) {
> +               dep->flags |= DWC3_EP_END_TRANSFER_PENDING;
> +               wait_event_lock_irq(dep->wait_end_transfer,
> +                               !(dep->flags & DWC3_EP_END_TRANSFER_PENDING),
> +                               dwc->lock);
> +       }

I've tested your patch, unfortunately it can not work on my platform.
Since we can not issue wait_event_lock_irq() function in atomic
context, we will hold another 'cdev->lock' in composite_disconnect()
function. The dump stack as below:

[   92.686810] c7 BUG: scheduling while atomic: init/1/0x00000002
[   92.686825] c7 INFO: lockdep is turned off.
[   92.686833] c0 Modules linked in: mtty marlin2_fm mali_kbase(O)
[   92.686859] c0 Preemption disabled at:[<ffffffc00064cea8>]
composite_disconnect+0x30/0x90
[   92.686879] c7
[   92.686891] c7 CPU: 7 PID: 1 Comm: init Tainted: G        W  O    4.4.6+ #20
[   92.686898] c7 Hardware name: Spreadtrum SP9860g Board (DT)
[   92.686905] c7 Call trace:
[   92.689519] c7 [<ffffffc00008b538>] dump_backtrace+0x0/0x170
[   92.689528] c7 [<ffffffc00008b6c8>] show_stack+0x20/0x28
[   92.689537] c7 [<ffffffc000424114>] dump_stack+0xa8/0xe0
[   92.689547] c7 [<ffffffc0000fa7d8>] __schedule_bug+0x7c/0xd4
[   92.689559] c7 [<ffffffc000aa5d20>] __schedule+0x724/0xab4
[   92.689567] c7 [<ffffffc000aa62f8>] schedule+0x48/0xa8
[   92.689578] c7 [<ffffffc000625fb8>]
dwc3_stop_active_transfer.constprop.27+0x98/0x158
[   92.689587] c7 [<ffffffc000626c88>] dwc3_remove_requests+0x3c/0xa0
[   92.689595] c7 [<ffffffc000627d08>] __dwc3_gadget_ep_disable+0x4c/0x12c
[   92.689603] c7 [<ffffffc000628188>] dwc3_gadget_ep_disable+0x6c/0x100
[   92.689613] c7 [<ffffffc0006708b4>] mtp_function_disable+0xc0/0x100
[   92.689624] c7 [<ffffffc00064bfa0>] reset_config+0x4c/0x94
[   92.689632] c7 [<ffffffc00064cebc>] composite_disconnect+0x44/0x90
[   92.689641] c7 [<ffffffc0006509c4>] android_disconnect+0x60/0x70
[   92.689649] c7 [<ffffffc000653080>] usb_gadget_remove_driver+0x6c/0xe0
[   92.689657] c7 [<ffffffc000653170>] usb_gadget_unregister_driver+0x7c/0xc8
[   92.689665] c7 [<ffffffc000651adc>] gadget_dev_desc_UDC_store+0x8c/0x128
[   92.689675] c7 [<ffffffc0002c263c>] configfs_write_file+0xd0/0x13c
[   92.689684] c7 [<ffffffc00023cb14>] vfs_write+0xb8/0x214
[   92.689692] c7 [<ffffffc00023d594>] SyS_write+0x54/0xb0
[   92.689701] c7 [<ffffffc000085ff0>] el0_svc_naked+0x24/0x28

> +
>         dep->resource_index = 0;
>         dep->flags &= ~DWC3_EP_BUSY;
>
> -       if (dwc3_is_usb31(dwc) || dwc->revision < DWC3_REVISION_310A)
> -               udelay(100);
> +       if (dep->flags & DWC3_EP_KICK_POST_END_TRANSFER) {
> +               dep->flags &= ~DWC3_EP_KICK_POST_END_TRANSFER;
> +               ret = __dwc3_gadget_kick_transfer(dep, 0);
> +               WARN_ON_ONCE(ret);
> +       }
>  }
>
>  static void dwc3_stop_active_transfers(struct dwc3 *dwc)
> --
> 2.10.0.440.g21f862b
>
>
> --
> balbi



-- 
Baolin.wang
Best Regards

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ