Message-ID: <1d320aac-e928-4fd0-812c-268a3a943575@oss.qualcomm.com>
Date: Thu, 24 Jul 2025 13:46:28 +0200
From: Konrad Dybcio <konrad.dybcio@....qualcomm.com>
To: Akhil P Oommen <akhilpo@....qualcomm.com>,
        Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>
Cc: Rob Clark <robin.clark@....qualcomm.com>, Sean Paul <sean@...rly.run>,
        Konrad Dybcio <konradybcio@...nel.org>,
        Dmitry Baryshkov <lumag@...nel.org>,
        Abhinav Kumar <abhinav.kumar@...ux.dev>,
        Jessica Zhang <jessica.zhang@....qualcomm.com>,
        Marijn Suijten <marijn.suijten@...ainline.org>,
        David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
        linux-arm-msm@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        freedreno@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 07/17] drm/msm/adreno: Add fenced regwrite support

On 7/23/25 11:06 PM, Akhil P Oommen wrote:
> On 7/22/2025 8:22 PM, Konrad Dybcio wrote:
>> On 7/22/25 3:39 PM, Dmitry Baryshkov wrote:
>>> On Sun, Jul 20, 2025 at 05:46:08PM +0530, Akhil P Oommen wrote:
>>>> There are some special registers which are accessible even when GX power
>>>> domain is collapsed during an IFPC sleep. Accessing these registers
>>>> wakes up GPU from power collapse and allows programming these registers
>>>> without additional handshake with GMU. This patch adds support for this
>>>> special register write sequence.
>>>>
>>>> Signed-off-by: Akhil P Oommen <akhilpo@....qualcomm.com>
>>>> ---
>>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c     | 63 ++++++++++++++++++++++++++++++-
>>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |  1 +
>>>>  drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 20 +++++-----
>>>>  3 files changed, 73 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> index 491fde0083a202bec7c6b3bca88d0e5a717a6560..8c004fc3abd2896d467a9728b34e99e4ed944dc4 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> @@ -16,6 +16,67 @@
>>>>  
>>>>  #define GPU_PAS_ID 13
>>>>  
>>>> +static bool fence_status_check(struct msm_gpu *gpu, u32 offset, u32 value, u32 status, u32 mask)
>>>> +{
>>>> +	/* Success if !writedropped0/1 */
>>>> +	if (!(status & mask))
>>>> +		return true;
>>>> +
>>>> +	udelay(10);
>>>
>>> Why do we need udelay() here? Why can't we use interval setting inside
>>> gmu_poll_timeout()?
>>
>> Similarly here:
>>
>> [...]
>>
>>>> +	if (!gmu_poll_timeout(gmu, REG_A6XX_GMU_AHB_FENCE_STATUS, status,
>>>> +			fence_status_check(gpu, offset, value, status, mask), 0, 1000))
>>>> +		return 0;
>>>> +
>>>> +	dev_err_ratelimited(gmu->dev, "delay in fenced register write (0x%x)\n",
>>>> +			offset);
>>>> +
>>>> +	/* Try again for another 1ms before failing */
>>>> +	gpu_write(gpu, offset, value);
>>>> +	if (!gmu_poll_timeout(gmu, REG_A6XX_GMU_AHB_FENCE_STATUS, status,
>>>> +			fence_status_check(gpu, offset, value, status, mask), 0, 1000))
>>>> +		return 0;
>>>> +
>>>> +	dev_err_ratelimited(gmu->dev, "fenced register write (0x%x) fail\n",
>>>> +			offset);
>>
>> We may want to combine the two, so as not to worry the user too much..
>>
>> If it's going to fail, I would assume it's going to fail both checks
>> (unless e.g. the bus is so congested a single write can't go through
>> to a sleepy GPU across 2 milliseconds, but that's another issue)
> 
> In case of success, we cannot be sure if the first write went through.
> So we should poll separately.

You're writing to it 2 times (outside fence_status_check) plus 2*1000/10
times (inside), i.e. 202 times in total, so it really had better go through.
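
For reference, the count above can be reproduced with a small standalone
model. This is only a sketch, not driver code: the helper names are
hypothetical, and it assumes that every failed status check waits 10 us and
then re-issues the write, which is what the arithmetic implies (that part of
the patch is elided in the quote).

```c
#include <assert.h>

/*
 * Hypothetical model, not driver code: number of register writes issued by
 * one fenced-write attempt when the fence status never clears.
 */
static unsigned int writes_per_attempt(unsigned int timeout_us,
				       unsigned int delay_us)
{
	assert(delay_us > 0);

	/* One explicit write, plus one rewrite per failed poll iteration. */
	return 1 + timeout_us / delay_us;
}

/* Total writes across all attempts in the worst (always failing) case. */
static unsigned int worst_case_writes(unsigned int attempts,
				      unsigned int timeout_us,
				      unsigned int delay_us)
{
	return attempts * writes_per_attempt(timeout_us, delay_us);
}
```

With two attempts, a 1000 us timeout and a 10 us delay,
worst_case_writes(2, 1000, 10) evaluates to 2 * (1 + 100) = 202.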

If it's just about the write reaching the GPU, you can write it once and
read back the register you've written to. That way you can be sure that the
GPU has observed the write.
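
A minimal sketch of the write-then-read-back idea, using a mock register
file so that it runs standalone. Everything here (struct mock_gpu,
mock_write, mock_read, write_and_verify, the drop_writes flag) is a
hypothetical stand-in and not the msm driver API; in the driver this would
presumably be built on the existing register accessors instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_NUM_REGS 16u

/* Hypothetical stand-in for a device register file; not the msm driver API. */
struct mock_gpu {
	uint32_t regs[MOCK_NUM_REGS];
	bool drop_writes;	/* models a fence that silently drops writes */
};

static void mock_write(struct mock_gpu *gpu, uint32_t offset, uint32_t value)
{
	if (offset < MOCK_NUM_REGS && !gpu->drop_writes)
		gpu->regs[offset] = value;
}

static uint32_t mock_read(const struct mock_gpu *gpu, uint32_t offset)
{
	return offset < MOCK_NUM_REGS ? gpu->regs[offset] : 0;
}

/* Write once, then read the same register back to confirm the value landed. */
static bool write_and_verify(struct mock_gpu *gpu, uint32_t offset,
			     uint32_t value)
{
	mock_write(gpu, offset, value);
	return mock_read(gpu, offset) == value;
}
```

One caveat of this simple check: if the register already held the value
being written, the read-back matches even when the write was dropped.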

Konrad
