Message-ID: <87wmf7ahc3.fsf@oltmanns.dev>
Date: Mon, 06 Jan 2025 20:10:52 +0100
From: Frank Oltmanns <frank@...manns.dev>
To: Stephan Gerhold <stephan.gerhold@...aro.org>
Cc: Johan Hovold <johan+linaro@...nel.org>, Dmitry Baryshkov
<dmitry.baryshkov@...aro.org>, Bjorn Andersson <andersson@...nel.org>,
Konrad Dybcio <konradybcio@...nel.org>, Chris Lew
<quic_clew@...cinc.com>, Abel Vesa <abel.vesa@...aro.org>,
linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
regressions@...ts.linux.dev, stable@...r.kernel.org
Subject: Re: [PATCH] soc: qcom: mark pd-mapper as broken
On 2024-10-11 at 12:01:48 +0200, Stephan Gerhold <stephan.gerhold@...aro.org> wrote:
> On Thu, Oct 10, 2024 at 09:42:46AM +0200, Johan Hovold wrote:
>> When using the in-kernel pd-mapper on x1e80100, client drivers often
>> fail to communicate with the firmware during boot, which specifically
>> breaks battery and USB-C altmode notifications. This has been observed
>> to happen on almost every second boot (41%) but likely depends on probe
>> order:
>>
>> pmic_glink_altmode.pmic_glink_altmode pmic_glink.altmode.0: failed to send altmode request: 0x10 (-125)
>> pmic_glink_altmode.pmic_glink_altmode pmic_glink.altmode.0: failed to request altmode notifications: -125
>>
>> ucsi_glink.pmic_glink_ucsi pmic_glink.ucsi.0: failed to send UCSI read request: -125
>>
>> qcom_battmgr.pmic_glink_power_supply pmic_glink.power-supply.0: failed to request power notifications
>>
>> In the same setup audio also fails to probe albeit much more rarely:
>>
>> PDR: avs/audio get domain list txn wait failed: -110
>> PDR: service lookup for avs/audio failed: -110
>>
>> Chris Lew has provided an analysis and is working on a fix for the
>> ECANCELED (125) errors, but it is not yet clear whether this will also
>> address the audio regression.
>>
>> Even if this was first observed on x1e80100 there is currently no reason
>> to believe that these issues are specific to that platform.
>>
>> Disable the in-kernel pd-mapper for now, and make sure to backport this
>> to stable to prevent users and distros from migrating away from the
>> user-space service.
>>
>> Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation")
>> Cc: stable@...r.kernel.org # 6.11
>> Link: https://lore.kernel.org/lkml/Zqet8iInnDhnxkT9@hovoldconsulting.com/
>> Signed-off-by: Johan Hovold <johan+linaro@...nel.org>
>> ---
>>
>> It's now been over two months since I reported this regression, and even
>> if we seem to be making some progress on at least some of these issues I
>> think we need to disable the pd-mapper temporarily until the fixes are in
>> place (e.g. to prevent distros from dropping the user-space service).
>>
>
> This is just a random thought, but I wonder if we could insert a delay
> somewhere as temporary workaround to make the in-kernel pd-mapper more
> reliable. I just tried replicating the userspace pd-mapper timing on
> X1E80100 CRD by:
>
> 1. Disabling auto-loading of qcom_pd_mapper
> (modprobe.blacklist=qcom_pd_mapper)
> 2. Adding a systemd service that does nothing except running
> "modprobe qcom_pd_mapper" at the same point in time where the
> userspace pd-mapper would usually be started.
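
A minimal sketch of the two steps quoted above, for anyone who wants to try them. The unit file name, the modprobe path and the install target are assumptions and do not come from this thread; the ordering should be adjusted to match wherever the userspace pd-mapper service would start on the system in question.

```ini
# Step 1 (kernel command line, as quoted above):
#   modprobe.blacklist=qcom_pd_mapper
#
# Step 2: /etc/systemd/system/load-qcom-pd-mapper.service
# (hypothetical file name, chosen for illustration only)
[Unit]
Description=Load qcom_pd_mapper late (workaround sketch)

[Service]
Type=oneshot
# The path to modprobe may differ between distributions (assumption).
ExecStart=/sbin/modprobe qcom_pd_mapper
RemainAfterExit=yes

[Install]
# Assumed target; pick the one the userspace pd-mapper unit would use.
WantedBy=multi-user.target
```

Such a unit would be enabled once with "systemctl enable load-qcom-pd-mapper.service". The blacklist only suppresses automatic loading, so the explicit modprobe call should still load the module.
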

Thank you so much for this idea. I'm currently using this workaround on
my sdm845 device (where the in-kernel pd-mapper is breaking the
out-of-tree call audio functionality).

Is any work under way to make the timing of the in-kernel pd-mapper
more reliable?

Cheers,
Frank

> This seems to work quite well for me, I haven't seen any of the
> mentioned errors anymore in a couple of boot tests. Clearly, there is no
> actual bug in the in-kernel pd-mapper, only worse timing.
>
> Thanks,
> Stephan