lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 3 Jan 2023 20:34:55 -0700
From:   Jeffrey Hugo <jeffrey.l.hugo@...il.com>
To:     Baochen Qiang <quic_bqiang@...cinc.com>
Cc:     manivannan.sadhasivam@...aro.org, linux-arm-msm@...r.kernel.org,
        linux-kernel@...r.kernel.org, ath11k@...ts.infradead.org,
        mhi@...ts.linux.dev
Subject: Re: [PATCH] bus: mhi: host: Change the log levels for SYS_ERR event

So your firmware is glitching, but it isn't kicking you to the RDDM
EE?  RDDM EE triggers an entirely different code path than the syserr
process. If that assumption is correct, then I'm not entirely sure why
that check exists since the current code would do the syserr
processing only if it's not a RDDM event.

There are other reasons the FW might trigger syserr, which would not
be a fatal error or rddm, and that should trigger both the
controller's processing as well as the MHI core.

If I were to guess, I would say that Hemanth and Bhaumik had that in
there because they were concerned about the RDDM processing triggering
multiple times.  I don't see how RDDM processing can trigger without
the RDDM EE, which seems to make that concern moot.  Sadly, I can no
longer ask them to confirm.

Have you experimented with removing that check?  That seems like a
valid fix for your system.  What you propose is bypassing the dynamic
debug mechanism, which doesn't seem justified in this case from what
I've seen so far.

-Jeff

On Tue, Jan 3, 2023 at 8:17 PM Baochen Qiang <quic_bqiang@...cinc.com> wrote:
>
>
> On 1/4/2023 11:11 AM, Jeffrey Hugo wrote:
> > On Tue, Jan 3, 2023 at 7:57 PM Baochen Qiang <quic_bqiang@...cinc.com> wrote:
> >>
> >> On 1/4/2023 10:41 AM, Jeffrey Hugo wrote:
> >>> Why was this not sent to the MHI mailing list?
> >> I don't know the MHI mailing list address, could tell me that?
> > The relevant entry from MAINTAINERS -
> >
> > MHI BUS
> > M: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
> > L: mhi@...ts.linux.dev
> > L: linux-arm-msm@...r.kernel.org
> > S: Maintained
> > T: git git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi.git
> > F: Documentation/ABI/stable/sysfs-bus-mhi
> > F: Documentation/mhi/
> > F: drivers/bus/mhi/
> > F: include/linux/mhi.h
> >
> >>> On Tue, Jan 3, 2023 at 7:19 PM Baochen Qiang <quic_bqiang@...cinc.com> wrote:
> >>>> Currently no log printed when SYS_ERR happens, this makes
> >>>> debug quite hard, so change log level to make it noisy.
> >>> You are going to need to explain this more.
> >>> There are two drivers in the upstream kernel that are MHI clients -
> >>> pci_generic and ath11k.
> >>> I'm assuming that you care about ath11k because you included that mail list.
> >> Yes, I am talking about ath11k.
> >>> In ath11k_mhi_op_status_cb() I see a warning message printed when the
> >>> syserr callback is triggered.
> >>> I see something similar in pci_generic.
> >>>
> >>> Looks like a log is printed when SYS_ERR happens in all possible
> >>> scenarios, so I don't understand the point of this change.
> >>> Particularly given that dev_dbg messages can be trivially enabled.
> >>>
> >>> -Jeff
> >> Well, this is not true in some cases. For example, we have met cases where
> >>
> >> WLAN HW/firmware is not working well, and only send a SYS_ERR event to MHI
> >>
> >> driver, however this event is not sent to ath11k host becuase of
> >> mhi_pm_sys_err_handler(),
> >>
> >> so we got no log at all.
> >>
> > With the 6.1 kernel?
> >
> > mhi_pm_sys_err_handler() queues the st_worker.
> >
> > mhi_pm_st_worker() , which is the st_worker function, calls
> > mhi_pm_sys_error_transition() in the DEV_ST_TRANSITION_SYS_ERR case
> > (we are processing a SYS_ERR).
> >
> > Pretty much the first thing mhi_pm_sys_err_transition() does is this -
> >
> > /* We must notify MHI control driver so it can clean up first */
> > mhi_cntrl->status_cb(mhi_cntrl, MHI_CB_SYS_ERROR);
> >
> > Which calls the ath11k driver ath11k_mhi_op_status_cb() I mentioned earlier.
> >
> > -Jeff
>
>
> No, mhi_pm_sys_err_handler() will NOT queue the st_worker because ath11k
> host supports RDDM, so the SYS_ERR event will be skipped. See below log:
>
> kernel: [  165.393720] mhi mhi0: local ee: MISSION MODE state: M0 device
> ee: RAMDUMP DOWNLOAD MODE state: M0
> kernel: [  165.401820] mhi mhi0: State change event to state: SYS ERROR
> kernel: [  165.401824] mhi mhi0: System error detected
> kernel: [  165.401827] mhi mhi0: Controller supports RDDM, skip SYS_ERROR
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ