Message-ID: <CAD=FV=XNTgzccjkQOnuTcYtaUK+ZRU1DbqYdnNOOD+TrVGn9xA@mail.gmail.com>
Date: Fri, 18 Oct 2024 08:22:36 -0700
From: Doug Anderson <dianders@...omium.org>
To: George-Daniel Matei <danielgeorgem@...omium.org>
Cc: "David S. Miller" <davem@...emloft.net>, Hayes Wang <hayeswang@...ltek.com>,
Heiner Kallweit <hkallweit1@...il.com>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Grant Grundler <grundler@...omium.org>, linux-usb@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] r8152: fix deadlock in usb reset during resume
Hi,
On Fri, Oct 18, 2024 at 7:13 AM George-Daniel Matei
<danielgeorgem@...omium.org> wrote:
>
> rtl8152_system_resume() issues a synchronous usb reset if the device is
> inaccessible. __rtl8152_set_mac_address() is called via
> rtl8152_post_reset() and it tries to take the same mutex that was already
> taken in rtl8152_resume().
Thanks for the fix! I'm 99% certain I tested the original code, but I
guess somehow I ran a different code path. I just put my old hacky
test patch [1] back on and re-tested this to see what happened. OK, I
see. In my case dev_set_mac_address() gets called at resume time but
then the address hasn't changed so "ops->ndo_set_mac_address()" (which
points to rtl8152_set_mac_address()) never gets called and I don't end
up in the deadlock. I wonder why the MAC address changed for you. In
any case, the deadlock is real and I agree that this should be fixed.

BTW: it would be handy to include the call stack of the deadlock in
your commit message.

[1] https://crrev.com/c/5543125
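
To make the two code paths above concrete, here is a rough userspace
sketch of the locking pattern. It is not driver code: struct model_dev
and every model_*() function are hypothetical stand-ins, and a pthread
error-checking mutex is used so the nested lock reports EDEADLK instead
of hanging the way a kernel mutex would. It only assumes what is
described above: the reset path re-enters the MAC-address setter while
the control mutex is held, and the setter is skipped when the address
is unchanged.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct r8152; only what the sketch needs. */
struct model_dev {
	pthread_mutex_t control;   /* models tp->control */
	unsigned char addr[6];     /* models the current MAC address */
	bool inaccessible;         /* models the RTL8152_INACCESSIBLE flag */
	int relock_error;          /* error seen on a nested lock attempt */
};

static int model_init(struct model_dev *d)
{
	pthread_mutexattr_t attr;
	int err;

	memset(d, 0, sizeof(*d));
	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;
	/* Error-checking type: relocking by the owner returns EDEADLK. */
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&d->control, &attr);
	pthread_mutexattr_destroy(&attr);
	return err;
}

/* Models the MAC-address setter, which takes the control mutex itself. */
static void model_set_mac(struct model_dev *d, const unsigned char *addr)
{
	int err = pthread_mutex_lock(&d->control);

	if (err) {
		/* The caller already holds the mutex; a kernel mutex would hang. */
		d->relock_error = err;
		return;
	}
	memcpy(d->addr, addr, sizeof(d->addr));
	pthread_mutex_unlock(&d->control);
}

/* Models the post-reset step: the setter only runs if the address changed. */
static void model_post_reset(struct model_dev *d, const unsigned char *addr)
{
	if (memcmp(d->addr, addr, sizeof(d->addr)) != 0)
		model_set_mac(d, addr);
}

/* Models the pre-patch resume path: the reset runs with the mutex held.
 * Returns 0 if no nested lock was attempted, else the lock error.
 */
static int model_resume_locked_reset(struct model_dev *d,
				     const unsigned char *addr_after_reset)
{
	int err;

	d->relock_error = 0;
	err = pthread_mutex_lock(&d->control);
	assert(err == 0);          /* the outer lock is expected to succeed */
	if (d->inaccessible)
		model_post_reset(d, addr_after_reset);
	pthread_mutex_unlock(&d->control);
	return d->relock_error;
}
```

Calling model_resume_locked_reset() with the same address the device
already has returns 0 (my case), while passing a different address
returns EDEADLK (the case you hit).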
> Move the call to reset usb in rtl8152_resume()
> outside mutex protection.
>
> Signed-off-by: George-Daniel Matei <danielgeorgem@...omium.org>
Before your Signed-off-by you should have:
Fixes: 4933b066fefb ("r8152: If inaccessible at resume time, issue a reset")
> ---
> drivers/net/usb/r8152.c | 26 +++++++++++++-------------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index a5612c799f5e..69d66ce7a5c5 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -8564,19 +8564,6 @@ static int rtl8152_system_resume(struct r8152 *tp)
> usb_submit_urb(tp->intr_urb, GFP_NOIO);
> }
>
> - /* If the device is RTL8152_INACCESSIBLE here then we should do a
> - * reset. This is important because the usb_lock_device_for_reset()
> - * that happens as a result of usb_queue_reset_device() will silently
> - * fail if the device was suspended or if too much time passed.
> - *
> - * NOTE: The device is locked here so we can directly do the reset.
> - * We don't need usb_lock_device_for_reset() because that's just a
> - * wrapper over device_lock() and device_resume() (which calls us)
> - * does that for us.
> - */
> - if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
> - usb_reset_device(tp->udev);
> -
> return 0;
> }
>
> @@ -8681,6 +8668,19 @@ static int rtl8152_suspend(struct usb_interface *intf, pm_message_t message)
>
> mutex_unlock(&tp->control);
>
> + /* If the device is RTL8152_INACCESSIBLE here then we should do a
> + * reset. This is important because the usb_lock_device_for_reset()
> + * that happens as a result of usb_queue_reset_device() will silently
> + * fail if the device was suspended or if too much time passed.
> + *
> + * NOTE: The device is locked here so we can directly do the reset.
> + * We don't need usb_lock_device_for_reset() because that's just a
> + * wrapper over device_lock() and device_resume() (which calls us)
> + * does that for us.
> + */
> + if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
> + usb_reset_device(tp->udev);
You seem to have moved this to the wrong function. It should be in
rtl8152_resume() but you've moved it to rtl8152_suspend(). As you have
it here you'll avoid the deadlock but I fear you may end up missing a
reset. Maybe you didn't notice this because commit 8c1d92a740c0
("r8152: Wake up the system if the we need a reset") woke us up
quickly enough and the previous reset hadn't expired yet?

In any case, please move it to the rtl8152_resume() function, re-test,
and post a new version.
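
For clarity, the ordering I have in mind looks roughly like the
userspace sketch below. Again this is only a model and not the patch
itself: struct fix_dev and the fix_*() functions are hypothetical
names, and a pthread error-checking mutex stands in for tp->control so
that a nested lock would show up as an error rather than a hang. The
point is just that the reset happens in the resume path, after the
control mutex has been dropped.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct r8152; only what the sketch needs. */
struct fix_dev {
	pthread_mutex_t control;   /* models tp->control */
	bool inaccessible;         /* models the RTL8152_INACCESSIBLE flag */
	int resets;                /* number of modeled resets issued */
	int relock_error;          /* error seen on a nested lock attempt */
};

static int fix_init(struct fix_dev *d)
{
	pthread_mutexattr_t attr;
	int err;

	memset(d, 0, sizeof(*d));
	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;
	/* Error-checking type: relocking by the owner returns EDEADLK. */
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&d->control, &attr);
	pthread_mutexattr_destroy(&attr);
	return err;
}

/* Models the post-reset work, which needs the control mutex
 * (worst case: the MAC address changed and the setter runs).
 */
static void fix_post_reset(struct fix_dev *d)
{
	int err = pthread_mutex_lock(&d->control);

	if (err) {
		d->relock_error = err;
		return;
	}
	pthread_mutex_unlock(&d->control);
}

/* Models the resume path with the reset moved outside the mutex.
 * Returns 0 if the post-reset work could take the mutex normally.
 */
static int fix_resume(struct fix_dev *d)
{
	int err;

	d->relock_error = 0;
	err = pthread_mutex_lock(&d->control);
	assert(err == 0);          /* the outer lock is expected to succeed */
	/* ... the normal resume work would run here, under the mutex ... */
	pthread_mutex_unlock(&d->control);

	/* Reset only after unlocking, so the post-reset work can lock. */
	if (d->inaccessible) {
		d->resets++;
		fix_post_reset(d);
	}
	return d->relock_error;
}
```

With the inaccessible flag set, fix_resume() issues one modeled reset
and returns 0, since nothing holds the mutex when the post-reset work
runs.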
-Doug