Date:   Tue, 28 Jul 2020 12:51:44 -0700
From:   Stephen Boyd <swboyd@...omium.org>
To:     Lina Iyer <ilina@...eaurora.org>
Cc:     Rajendra Nayak <rnayak@...eaurora.org>,
        Stanimir Varbanov <stanimir.varbanov@...aro.org>,
        robh+dt@...nel.org, agross@...nel.org, bjorn.andersson@...aro.org,
        linux-arm-msm@...r.kernel.org, linux-media@...r.kernel.org,
        devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
        mka@...omium.org, Maulik Shah <mkshah@...eaurora.org>
Subject: Re: [PATCH v4 4/5] arm64: dts: sdm845: Add OPP tables and power-domains for venus

Quoting Lina Iyer (2020-07-28 09:52:12)
> On Mon, Jul 27 2020 at 18:45 -0600, Stephen Boyd wrote:
> >Quoting Lina Iyer (2020-07-24 09:28:25)
> >> On Fri, Jul 24 2020 at 03:03 -0600, Rajendra Nayak wrote:
> >> >Hi Maulik/Lina,
> >> >
> >> >On 7/23/2020 11:36 PM, Stanimir Varbanov wrote:
> >> >>Hi Rajendra,
> >> >>
> >> >>After applying 2,3 and 4/5 patches on linaro-integration v5.8-rc2 I see
> >> >>below messages on db845:
> >> >>
> >> >>qcom-venus aa00000.video-codec: dev_pm_opp_set_rate: failed to find
> >> >>current OPP for freq 533000097 (-34)
> >> >>
> >> >>^^^ This one is new.
> >> >>
> >> >>qcom_rpmh TCS Busy, retrying RPMH message send: addr=0x30000
> >> >>
> >> >>^^^ and this message is annoying, can we make it pr_debug in rpmh?
> >> >
> >> How annoyingly often do you see this message?
> >> Usually, this is an indication of bad system state either on remote
> >> processors in the SoC or in Linux itself. On a smooth sailing build you
> >> should not see this 'warning'.
> >>
> >> >Would you be fine with moving this message to a pr_debug? It's currently
> >> >a pr_info_ratelimited()
> >> I would rather not, moving this out of sight will mask a lot of serious
> >> issues that otherwise bring attention to the developers.
> >>
> >
> >I removed this warning message in my patch posted to the list[1]. If
> >it's a serious problem then I suppose a timeout is more appropriate, on
> >the order of several seconds or so and then a pr_warn() and bail out of
> >the async call with an error.
> >
> The warning used to capture issues that happen within a second and it
> helps capture system related issues. Timing out after many seconds
> overlooks the system issues that generally tend to resolve themselves, but
> nevertheless need to be investigated.
> 

Is it correct to read "system related issues" as performance problems
where the thread is spinning forever trying to send a message and it
can't? So the problem is mostly that it's an unbounded amount of time
before the message is sent to rpmh and this printk helps identify those
situations where that is happening?

Otherwise, as you say above, is it a bad system state where the rpmh
processor has gotten into trouble, such as a crash? Can we recover from
that? Or is the only recovery a reboot of the system? Does the rpmh
processor reboot the system if it crashes?
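
To make the timeout idea discussed above concrete, here is a minimal
userspace sketch of a bounded retry: keep trying to send, and once a time
budget is used up, warn once and return an error instead of spinning
forever. This is not the rpmh driver code. The names send_with_timeout(),
try_send_fn, busy_n_times() and always_busy() are hypothetical, the 1 ms
poll interval is arbitrary, and fprintf() stands in for pr_warn().

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for "try to claim a TCS and send"; not an rpmh API. */
typedef bool (*try_send_fn)(void *ctx);

/* Monotonic clock in milliseconds, so wall-clock jumps do not matter. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Retry try_send() until it succeeds or timeout_ms has elapsed.
 * Returns 0 on success, -ETIMEDOUT if the budget ran out.
 */
static int send_with_timeout(try_send_fn try_send, void *ctx,
			     long long timeout_ms)
{
	const struct timespec poll = { .tv_sec = 0, .tv_nsec = 1000000 };
	long long deadline;

	assert(try_send != NULL);
	deadline = now_ms() + timeout_ms;

	for (;;) {
		if (try_send(ctx))
			return 0;

		if (now_ms() >= deadline) {
			/* One warning at the point of giving up (pr_warn() analogue). */
			fprintf(stderr, "busy for %lld ms, giving up\n",
				timeout_ms);
			return -ETIMEDOUT;
		}

		/* Back off briefly rather than busy-spinning. */
		nanosleep(&poll, NULL);
	}
}

/* Example callback: reports busy for the first *ctx attempts, then succeeds. */
static bool busy_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}

/* Example callback: never becomes free, to exercise the timeout path. */
static bool always_busy(void *ctx)
{
	(void)ctx;
	return false;
}
```

With busy_n_times() and a generous budget the call returns 0 after a few
retries; with always_busy() it returns -ETIMEDOUT once the budget expires.
The trade-off raised in the thread still applies: a long budget hides
sub-second stalls that a per-retry message would have made visible.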
