Message-ID: <20210615070747.GB31646@jackp-linux.qualcomm.com>
Date: Tue, 15 Jun 2021 00:07:47 -0700
From: Jack Pham <jackp@...eaurora.org>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: Naresh Kamboju <naresh.kamboju@...aro.org>,
open list <linux-kernel@...r.kernel.org>,
Shuah Khan <shuah@...nel.org>,
Florian Fainelli <f.fainelli@...il.com>, patches@...nelci.org,
lkft-triage@...ts.linaro.org, Jon Hunter <jonathanh@...dia.com>,
linux-stable <stable@...r.kernel.org>,
Pavel Machek <pavel@...x.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Guenter Roeck <linux@...ck-us.net>, linux-usb@...r.kernel.org,
Peter Chen <peter.chen@...nel.org>,
Felipe Balbi <balbi@...nel.org>
Subject: Re: [PATCH 5.10 000/130] 5.10.44-rc2 review

Hi Greg,

On Tue, Jun 15, 2021 at 08:05:50AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Jun 15, 2021 at 09:41:26AM +0530, Naresh Kamboju wrote:
> > On Mon, 14 Jun 2021 at 21:45, Greg Kroah-Hartman
> > <gregkh@...uxfoundation.org> wrote:
> > >
> > > This is the start of the stable review cycle for the 5.10.44 release.
> > > There are 130 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Wed, 16 Jun 2021 16:13:59 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.44-rc2.gz
> > > or in the git tree and branch at:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> > > and the diffstat can be found below.
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > The following kernel crash reported on stable rc 5.10.44-rc2 arm64 db845c board.
> >
> > [ 5.127966] dwc3-qcom a6f8800.usb: failed to get usb-ddr path: -517

Looks like -EPROBE_DEFER happened here due to a not-yet-probed
dependency (interconnect driver). This leads to dwc3_qcom_probe()
unwinding and calling of_platform_depopulate(), which triggers the
"child" dwc3's driver remove callback dwc3_remove()...

> > [ 5.145567] Unable to handle kernel NULL pointer dereference at
> > virtual address 0000000000000002
> > [ 5.154451] Mem abort info:
> > [ 5.157296] ESR = 0x96000004
> > [ 5.160401] EC = 0x25: DABT (current EL), IL = 32 bits
> > [ 5.165771] SET = 0, FnV = 0
> > [ 5.168873] EA = 0, S1PTW = 0
> > [ 5.172064] Data abort info:
> > [ 5.174980] ISV = 0, ISS = 0x00000004
> > [ 5.178860] CM = 0, WnR = 0
> > [ 5.181872] [0000000000000002] user address but active_mm is swapper
> > [ 5.188293] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> > [ 5.193922] Modules linked in:
> > [ 5.197022] CPU: 4 PID: 57 Comm: kworker/4:3 Not tainted 5.10.44-rc2 #1
> > [ 5.203697] Hardware name: Thundercomm Dragonboard 845c (DT)
> > [ 5.204022] ufshcd-qcom 1d84000.ufshc: ufshcd_print_pwr_info:[RX,
> > TX]: gear=[3, 3], lane[2, 2], pwr[FAST MODE, FAST MODE], rate = 2
> > [ 5.209434] Workqueue: events deferred_probe_work_func
> > [ 5.221786] ufshcd-qcom 1d84000.ufshc:
> > ufshcd_find_max_sup_active_icc_level: Regulator capability was not
> > set, actvIccLevel=0
> > [ 5.226541] pstate: 60c00005 (nZCv daif +PAN +UAO -TCO BTYPE=--)
> > [ 5.226551] pc : inode_permission+0x2c/0x178
> > [ 5.226559] lr : lookup_one_len_common+0xac/0x100
> >
> > ref:
> > https://lkft.validation.linaro.org/scheduler/job/2899138#L2873
> >
> > Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
> >
> > There is a crash like this reported and discussed on the mailing thread.
> > https://lore.kernel.org/linux-usb/20210608105656.10795-1-peter.chen@kernel.org/
>
> Is this crash just on shutdown? That's what that commit was fixing, but
> it is resolving an error that should not be in the 5.10.y tree.

Peter reported and fixed it based on reproducing the crash on shutdown,
but in my manual testing I found that it could be triggered any time
dwc3_remove() is called, though I surmised it would be a rare
occurrence. In this particular case, however, Naresh is reporting that
it is triggered even during bootup: dwc3-qcom adds its dwc3 child, but
because it then encounters a probe deferral it has to trigger the dwc3
driver's remove callback right after the child was probed.
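
To make the ordering concrete, here is a rough userspace sketch of that
sequence. It is not the driver code: every sim_* name is made up for
illustration, and the only value taken from the kernel is EPROBE_DEFER
(517), which matches the -517 in the log. It just models "add the child
first, look up the dependency second, unwind on failure".

```c
#include <assert.h>
#include <stdio.h>

/* Same value as EPROBE_DEFER in the kernel's include/linux/errno.h. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for the dwc3-qcom parent and its dwc3 child. */
struct sim_state {
	int child_probed;	/* 1 while the child is bound */
	int child_removes;	/* how many times the child's remove ran */
	int icc_ready;		/* 1 once the dependency has probed */
};

/* Models of_platform_populate(): the child gets bound right away. */
static void sim_populate(struct sim_state *s)
{
	s->child_probed = 1;
}

/* Models of_platform_depopulate(): runs the child's remove callback. */
static void sim_depopulate(struct sim_state *s)
{
	if (s->child_probed) {
		s->child_probed = 0;
		s->child_removes++;
	}
}

/* Models the dependency lookup that defers until its provider exists. */
static int sim_get_icc_path(const struct sim_state *s)
{
	return s->icc_ready ? 0 : -EPROBE_DEFER;
}

/* Models the parent probe: child added first, dependency checked after. */
static int sim_parent_probe(struct sim_state *s)
{
	int ret;

	assert(s != NULL);
	sim_populate(s);

	ret = sim_get_icc_path(s);
	if (ret) {
		printf("failed to get path: %d\n", ret);
		sim_depopulate(s);	/* child removed right after probing */
		return ret;
	}
	return 0;
}
```

With icc_ready left at 0, sim_parent_probe() returns -517 and the
child's remove path runs once during "boot", with no shutdown involved.
Once icc_ready is set, the retried probe succeeds and the child stays
bound.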

So I think it would be good if Peter's follow-up change
(2a042767814b in your usb-next branch) could go into stable as well,
since it should help beyond just the shutdown/reboot case. Otherwise,
my change "usb: dwc3: debugfs: Add and remove endpoint dirs
dynamically" could simply be dropped until the two can go in together.

Thanks,
Jack
--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project