Message-ID: <864j2couqu.wl-maz@kernel.org>
Date: Mon, 06 Jan 2025 14:57:29 +0000
From: Marc Zyngier <maz@...nel.org>
To: Sibi Sankar <quic_sibis@...cinc.com>
Cc: Johan Hovold <johan@...nel.org>,
<sudeep.holla@....com>,
<cristian.marussi@....com>,
<andersson@...nel.org>,
<konrad.dybcio@...aro.org>,
<robh+dt@...nel.org>,
<krzysztof.kozlowski+dt@...aro.org>,
<dmitry.baryshkov@...aro.org>,
<linux-kernel@...r.kernel.org>,
<linux-arm-msm@...r.kernel.org>,
<devicetree@...r.kernel.org>,
<quic_rgottimu@...cinc.com>,
<quic_kshivnan@...cinc.com>,
<conor+dt@...nel.org>,
<quic_nkela@...cinc.com>,
<quic_psodagud@...cinc.com>,
<abel.vesa@...aro.org>
Subject: Re: [PATCH V7 0/2] qcom: x1e80100: Enable CPUFreq

On Mon, 06 Jan 2025 12:22:48 +0000,
Sibi Sankar <quic_sibis@...cinc.com> wrote:
>
>
>
> On 12/5/24 21:16, Johan Hovold wrote:
> > On Thu, Dec 05, 2024 at 04:53:05PM +0530, Sibi Sankar wrote:
> >> On 11/5/24 23:42, Marc Zyngier wrote:
> >>> On Tue, 05 Nov 2024 16:57:07 +0000,
> >>> Johan Hovold <johan@...nel.org> wrote:
> >>>> On Fri, Nov 01, 2024 at 02:43:57PM +0000, Marc Zyngier wrote:
> >
> >>>>> I wonder whether the same sort of reset happens on more "commercial"
> >>>>> systems (such as some of the laptops). You expect that people look at
> >>>>> the cpufreq stuff closely, and don't see things exploding like we are.
> >>>>
> >>>> I finally got around to getting my Lenovo ThinkPad T14s to boot (it
> >>>> refuses to start the kernel when using GRUB, and it's not due to the
> >>>> known 64 GB memory issue as it only has 32 GB)
> >>>
> >>> <cry>
> >>> I know the feeling. My devkit can't use GRUB either, so I added a
> >>> hook to the GRUB config to generate EFI scripts that directly execute
> >>> the kernel with initrd, dtb, and command line.
> >>>
> >>> This is probably the worst firmware I've seen in a very long while.
> >>
> >> The PERF_LEVEL_GET implementation in the SCP firmware side
> >> is the reason for the crash :|, currently there is a bug
> >> in the kernel that picks up the index that we set with LEVEL_SET
> >> with fast channel and that masks the crash. I was told the
> >> crash happens when idle states are enabled and a regular
> >> LEVEL_GET message is triggered from the kernel. This was
> >> fixed a while back but it will take a while to flow back
> >> to all the devices. It should already be out on CRDs.
> >>
> >> Johan,
> >> Now that you are aware of the limitations can we make
> >> a call on how to deal with this and land cpufreq?
> >
> > As Marc said, it seems you need to come up with a way to detect and work
> > around the broken firmware.
>
> The perf protocol version won't have any changes so detecting
> it isn't possible :(

This is just... baffling. Can this be checked against one of the
strings contained in the DMI tables?

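To make the DMI idea concrete, here is a minimal user-space sketch of
what such a check could look like. It assumes the affected firmware
exposes a distinguishing string under /sys/class/dmi/id, which is
exactly the open question in this thread. The quirk table entries
(ExampleVendor, EXAMPLE-0.1) and the helper names are hypothetical
placeholders, not real identifiers, and this is not how the kernel
SCMI driver does (or would do) it.

```python
from pathlib import Path

# Standard sysfs location for DMI identification strings on Linux.
DMI_DIR = Path("/sys/class/dmi/id")
DMI_FIELDS = ("sys_vendor", "product_name", "product_family", "bios_version")

# Hypothetical quirk table: placeholder values only. Real entries would
# have to come from the vendor, if such strings exist at all.
BROKEN_FIRMWARE = [
    {"sys_vendor": "ExampleVendor", "bios_version": "EXAMPLE-0.1"},
]


def read_dmi(dmi_dir=DMI_DIR):
    """Return the DMI fields that are present and readable as a dict."""
    info = {}
    for field in DMI_FIELDS:
        try:
            info[field] = (dmi_dir / field).read_text().strip()
        except OSError:
            # Field missing or unreadable (e.g. no DMI tables): skip it.
            pass
    return info


def has_broken_perf_level_get(info, quirks=BROKEN_FIRMWARE):
    """True if every field of at least one quirk entry matches exactly."""
    return any(
        all(info.get(key) == value for key, value in entry.items())
        for entry in quirks
    )


if __name__ == "__main__":
    print(has_broken_perf_level_get(read_dmi()))
```

If no such string exists, or if fixed and broken firmware report the
same strings, a table like this cannot tell them apart.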
>
> >
> > We want to get the fast channel issue fixed, but when we merge that fix
> > it will trigger these crashes if we also merge cpufreq support for x1e.
> >
> > Can you expand on the PERF_LEVEL_GET issue? Is it possible to
> > implement some workaround for the buggy firmware? Like returning a dummy
> > value? How exactly are things working today? Can't that be used as a basis
> > for a quirk?
>
> The main problem is the X1E firmware supports fast channel level get
> but when queried it says it doesn't support it :|. The PERF_LEVEL_GET
> regular messaging which gets used as a fallback has a bug which causes
> the device to crash. So we either enable cpufreq only on platforms
> that have the fix in place

Again: how do we detect this?

> or live with the warning that certain messages
> don't support fast channel which I don't think will fly. I've also been
> told the crash wouldn't show up if we have all sleep states
> disabled.

So we have the choice between crashing quickly, or sucking power like
mad?

Thanks,
M.
--
Without deviation from the norm, progress is not possible.