lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e2cebf65-012d-f818-8202-eb511c996e28@linaro.org>
Date:   Thu, 29 Jul 2021 21:28:49 +0100
From:   Caleb Connolly <caleb.connolly@...aro.org>
To:     Rob Clark <robdclark@...il.com>
Cc:     dri-devel <dri-devel@...ts.freedesktop.org>,
        freedreno <freedreno@...ts.freedesktop.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>,
        Rob Clark <robdclark@...omium.org>,
        Sean Paul <sean@...rly.run>, David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>,
        Jordan Crouse <jordan@...micpenguin.net>,
        Jonathan Marek <jonathan@...ek.ca>,
        Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Sharat Masetty <smasetty@...eaurora.org>,
        Akhil P Oommen <akhilpo@...eaurora.org>,
        open list <linux-kernel@...r.kernel.org>,
        Stephen Boyd <sboyd@...nel.org>
Subject: Re: [PATCH] drm/msm: Disable frequency clamping on a630



On 29/07/2021 21:24, Rob Clark wrote:
> On Thu, Jul 29, 2021 at 1:06 PM Caleb Connolly
> <caleb.connolly@...aro.org> wrote:
>>
>> Hi Rob,
>>
>> I've done some more testing! It looks like before that patch ("drm/msm: Devfreq tuning") the GPU would never get above
>> the second frequency in the OPP table (342MHz) (at least, not in glxgears). With the patch applied it would more
>> aggressively jump up to the max frequency which seems to be unstable at the default regulator voltages.
> 
> *ohh*, yeah, ok, that would explain it
> 
>> Hacking the pm8005 s1 regulator (which provides VDD_GFX) up to 0.988v (instead of the stock 0.516v) makes the GPU stable
>> at the higher frequencies.
>>
>> Applying this patch reverts the behaviour, and the GPU never goes above 342MHz in glxgears, losing ~30% performance in
>> glxgear.
>>
>> I think (?) that enabling CPR support would be the proper solution to this - that would ensure that the regulators run
>> at the voltage the hardware needs to be stable.
>>
>> Is hacking the voltage higher (although ideally not quite that high) an acceptable short term solution until we have
>> CPR? Or would it be safer to just not make use of the higher frequencies on a630 for now?
>>
> 
> tbh, I'm not sure about the regulator stuff and CPR.. Bjorn is already
> on CC and I added sboyd, maybe one of them knows better.
> 
> In the short term, removing the higher problematic OPPs from dts might
> be a better option than this patch (which I'm dropping), since there
> is nothing stopping other workloads from hitting higher OPPs.
Oh yeah that sounds like a more sensible workaround than mine 😅.
> 
> I'm slightly curious why I didn't have problems at higher OPPs on my
> c630 laptop (sdm850)
Perhaps you won the sillicon lottery - iirc sdm850 is binned for higher clocks as is out of the factory.

Would it be best to drop the OPPs for all devices? Or just those affected? I guess it's possible another c630 might 
crash where yours doesn't?
> 
> BR,
> -R
> 
>>
>> On 29/07/2021 19:39, Rob Clark wrote:
>>> From: Rob Clark <robdclark@...omium.org>
>>>
>>> The more frequent frequency transitions resulting from clamping freq to
>>> minimum when the GPU is idle seems to be causing some issue with the bus
>>> getting voted off when it should be on.  (An enable racing with an async
>>> disable?)  This might be a problem outside of the GPU, as I can't
>>> reproduce this on a618 which uses the same GMU fw and same mechanism to
>>> communicate with GMU to set opp.  For now, just revert to previous
>>> devfreq behavior on a630 until the issue is understood.
>>>
>>> Reported-by: Caleb Connolly <caleb.connolly@...aro.org>
>>> Fixes: 9bc95570175a ("drm/msm: Devfreq tuning")
>>> Signed-off-by: Rob Clark <robdclark@...omium.org>
>>> ---
>>>    drivers/gpu/drm/msm/adreno/adreno_gpu.c |  3 +++
>>>    drivers/gpu/drm/msm/msm_gpu.h           |  2 ++
>>>    drivers/gpu/drm/msm/msm_gpu_devfreq.c   | 12 ++++++++++++
>>>    3 files changed, 17 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
>>> index 748665232d29..9fd08b413010 100644
>>> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
>>> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
>>> @@ -945,6 +945,9 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
>>>        pm_runtime_use_autosuspend(dev);
>>>        pm_runtime_enable(dev);
>>>
>>> +     if (adreno_is_a630(adreno_gpu))
>>> +             gpu->devfreq.disable_freq_clamping = true;
>>> +
>>>        return msm_gpu_init(drm, pdev, &adreno_gpu->base, &funcs->base,
>>>                        adreno_gpu->info->name, &adreno_gpu_config);
>>>    }
>>> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
>>> index 0e4b45bff2e6..7e11b667f939 100644
>>> --- a/drivers/gpu/drm/msm/msm_gpu.h
>>> +++ b/drivers/gpu/drm/msm/msm_gpu.h
>>> @@ -112,6 +112,8 @@ struct msm_gpu_devfreq {
>>>         * it is inactive.
>>>         */
>>>        unsigned long idle_freq;
>>> +
>>> +     bool disable_freq_clamping;
>>>    };
>>>
>>>    struct msm_gpu {
>>> diff --git a/drivers/gpu/drm/msm/msm_gpu_devfreq.c b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
>>> index 0a1ee20296a2..a832af436251 100644
>>> --- a/drivers/gpu/drm/msm/msm_gpu_devfreq.c
>>> +++ b/drivers/gpu/drm/msm/msm_gpu_devfreq.c
>>> @@ -94,6 +94,12 @@ void msm_devfreq_init(struct msm_gpu *gpu)
>>>        if (!gpu->funcs->gpu_busy)
>>>                return;
>>>
>>> +     /* Revert to previous polling interval if we aren't using freq clamping
>>> +      * to preserve previous behavior
>>> +      */
>>> +     if (gpu->devfreq.disable_freq_clamping)
>>> +             msm_devfreq_profile.polling_ms = 10;
>>> +
>>>        msm_devfreq_profile.initial_freq = gpu->fast_rate;
>>>
>>>        /*
>>> @@ -151,6 +157,9 @@ void msm_devfreq_active(struct msm_gpu *gpu)
>>>        unsigned int idle_time;
>>>        unsigned long target_freq = df->idle_freq;
>>>
>>> +     if (gpu->devfreq.disable_freq_clamping)
>>> +             return;
>>> +
>>>        /*
>>>         * Hold devfreq lock to synchronize with get_dev_status()/
>>>         * target() callbacks
>>> @@ -186,6 +195,9 @@ void msm_devfreq_idle(struct msm_gpu *gpu)
>>>        struct msm_gpu_devfreq *df = &gpu->devfreq;
>>>        unsigned long idle_freq, target_freq = 0;
>>>
>>> +     if (gpu->devfreq.disable_freq_clamping)
>>> +             return;
>>> +
>>>        /*
>>>         * Hold devfreq lock to synchronize with get_dev_status()/
>>>         * target() callbacks
>>>
>>
>> --
>> Kind Regards,
>> Caleb (they/them)

-- 
Kind Regards,
Caleb (they/them)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ