lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2eff49b1-f0ad-8d44-5902-93937867178b@linaro.org>
Date:   Thu, 30 Mar 2023 17:52:31 +0300
From:   Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To:     Konrad Dybcio <konrad.dybcio@...aro.org>
Cc:     Johan Hovold <johan@...nel.org>, linux-arm-msm@...r.kernel.org,
        andersson@...nel.org, agross@...nel.org,
        marijn.suijten@...ainline.org, Rob Clark <robdclark@...il.com>,
        Abhinav Kumar <quic_abhinavk@...cinc.com>,
        Sean Paul <sean@...rly.run>, David Airlie <airlied@...il.com>,
        Daniel Vetter <daniel@...ll.ch>,
        Johan Hovold <johan+linaro@...nel.org>,
        Akhil P Oommen <quic_akhilpo@...cinc.com>,
        "Joel Fernandes (Google)" <joel@...lfernandes.org>,
        Nathan Chancellor <nathan@...nel.org>,
        dri-devel@...ts.freedesktop.org, freedreno@...ts.freedesktop.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/msm/adreno: adreno_gpu: Use suspend() instead of
 idle() on load error

On 30/03/2023 17:34, Konrad Dybcio wrote:
> 
> 
> On 29.03.2023 21:45, Dmitry Baryshkov wrote:
>> On Wed, 29 Mar 2023 at 18:48, Konrad Dybcio <konrad.dybcio@...aro.org> wrote:
>>>
>>>
>>>
>>> On 29.03.2023 16:37, Johan Hovold wrote:
>>>> On Wed, Mar 29, 2023 at 04:04:44PM +0200, Konrad Dybcio wrote:
>>>>> If we fail to initialize the GPU for whatever reason (say we don't
>>>>> embed the GPU firmware files in the initrd), the error path involves
>>>>> pm_runtime_put_sync() which then calls idle() instead of suspend().
>>>>>
>>>>> This is suboptimal, as it means that we're not going through the
>>>>> clean shutdown sequence. With at least A619_holi, this makes the GPU
>>>>> not wake up until it goes through at least one more start-fail-stop
>>>>> cycle. Fix that by using pm_runtime_put_sync_suspend to force a clean
>>>>> shutdown.
>>>>
>>>> This does not sound right. If pm_runtime_put_sync() fails to suspend the
>>>> device when the usage count drops to zero, then you have a bug somewhere
>>>> else.
>>> I was surprised to see that it was not called as well, but I wasn't able
>>> to track it down before..
>>
>> Could you please check that it's autosuspend who kicks in? In other
>> words, if we disable autosuspend, the pm_runtime_put_sync is enough()?
>>
>> That would probably mean that we lack some kind of reset in the hw_init path.
>>
>> On the other hand, I do not know how the device will react to the
>> error-in-the-middle state. Modems for example, can enter the state
>> where you can not properly turn it off once it starts the boot
>> process.
>>
>> And if we remember the efforts that Akhil has put into making sure
>> that the GPU is properly reset in case of an _error_, it might be
>> nearly impossible to shut it down in a proper way.
>>
>> Thus said, I think that unless there is an obvious way to restart the
>> init process, Korad's pm_runtime_put_sync_suspend() looks like a
>> correct fix to me.
> On the GPU side, when you cut GX and CX GDSCs, the hardware is off.
> Some clock / gdsc logic may be retained, but the GPU itself gets
> cut off. Parking the clocks and shuttting down VDD_GX (if exists)
> only makes that stronger.

If I remember correctly, GX and CX GPU GDSCs might be voted by other 
users. Again, I'd direct you here to the series at [1]

[1]: https://patchwork.freedesktop.org/series/111966/

-- 
With best wishes
Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ