lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z2ApCU3DAxKS7Y9k@hovoldconsulting.com>
Date: Mon, 16 Dec 2024 14:20:09 +0100
From: Johan Hovold <johan@...nel.org>
To: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
Cc: mhi@...ts.linux.dev, linux-arm-msm@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Loic Poulain <loic.poulain@...aro.org>
Subject: Re: mhi resume failure on reboot with 6.13-rc2

On Mon, Dec 16, 2024 at 01:10:21PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Dec 11, 2024 at 04:03:59PM +0100, Johan Hovold wrote:
> > On Wed, Dec 11, 2024 at 08:23:15PM +0530, Manivannan Sadhasivam wrote:
> > > On Wed, Dec 11, 2024 at 03:17:22PM +0100, Johan Hovold wrote:
> > 
> > > > I just hit the following modem related error on reboot of the x1e80100
> > > > CRD for the second time with 6.13-rc2:
> > > > 
> > > > 	[  138.348724] shutdown[1]: Rebooting.
> > > >         [  138.545683] arm-smmu 3da0000.iommu: disabling translation
> > > >         [  138.582505] mhi mhi0: Resuming from non M3 state (SYS ERROR)
> > > >         [  138.588516] mhi-pci-generic 0005:01:00.0: failed to resume device: -22
> > > >         [  138.595375] mhi-pci-generic 0005:01:00.0: device recovery started
> > > >         [  138.603841] wwan wwan0: port wwan0qcdm0 disconnected
> > > >         [  138.609508] wwan wwan0: port wwan0mbim0 disconnected
> > > >         [  138.615137] wwan wwan0: port wwan0qmi0 disconnected
> > > >         [  138.702604] mhi mhi0: Requested to power ON
> > > >         [  139.027494] mhi mhi0: Power on setup success
> > > >         [  139.027640] mhi mhi0: Wait for device to enter SBL or Mission mode
> > > > 
> > > > and then the machine hangs.

> Could be. But the issue seems to be stemming from the modem crash while exiting
> M3. You can try removing the modem autosuspend by skipping the if condition
> block:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/bus/mhi/host/pci_generic.c?h=v6.13-rc1#n1184
> 
> If you no longer see the crash, then the issue might be with modem not coping
> up with autosuspend. If you still see the crash, then something else going wrong
> during reboot/power off.

I've only hit this issue three times and only since 6.13-rc2. So not
sure how useful that sort of experiment would be.

> > Is there anything you can do on the mhi side to prevent it from blocking
> > reboot/power off?
> 
> It should not block the reboot/power off forever. There is a timeout waiting for
> SBL/Mission mode and the max time is 24s (depending on the modem). Can you share
> the modem VID:PID?

I just hit the issue again and can confirm that it does block
reboot/shutdown forever (I've been waiting for 20 minutes now).

Judging from a quick look at the code, "Wait for device to enter SBL or
Mission mode" is printed by mhi_fw_load_handler(), which in turn is only
called from the mhi_pm_st_worker() state machine.

I can't seem to find anything that makes sure that the next state is
ever reached, so regardless of the cause of the modem fw crash (if
that's what it is) the hung reboot appears to be a bug in mhi.

This is with the SDX65 modem in the x1e80100 CRD:
	
	17cb:0308

Johan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ