Message-ID: <CADnq5_OHW9Sw5quFqk52ymGVKXe3PGidB9uLW9wcQcA=pCOTCA@mail.gmail.com>
Date: Wed, 13 Mar 2024 16:46:04 -0400
From: Alex Deucher <alexdeucher@...il.com>
To: Felix Kuehling <felix.kuehling@....com>
Cc: Sasha Levin <sashal@...nel.org>, linux-kernel@...r.kernel.org, stable@...r.kernel.org, 
	Prike Liang <Prike.Liang@....com>, Alex Deucher <alexander.deucher@....com>, 
	christian.koenig@....com, Xinhui.Pan@....com, airlied@...il.com, 
	daniel@...ll.ch, Hawking.Zhang@....com, lijo.lazar@....com, le.ma@....com, 
	James.Zhu@....com, shane.xiao@....com, sonny.jiang@....com, 
	amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH AUTOSEL 5.15 3/5] drm/amdgpu: Enable gpu reset for S3
 abort cases on Raven series

On Wed, Mar 13, 2024 at 4:12 PM Felix Kuehling <felix.kuehling@....com> wrote:
>
> On 2024-03-11 11:14, Sasha Levin wrote:
> > From: Prike Liang <Prike.Liang@....com>
> >
> > [ Upstream commit c671ec01311b4744b377f98b0b4c6d033fe569b3 ]
> >
> > GPU reset can now be performed successfully on the Raven series, and
> > GPU reset is required for the S3 suspend abort case. So enable GPU
> > reset for S3 abort cases on the Raven series.
>
> This looks suspicious to me. I'm not sure what conditions made the GPU
> reset successful. But unless all the changes involved were also
> backported, this should probably not be applied to older kernel
> branches. I'm speculating it may be related to the removal of AMD IOMMUv2.
>

We should get confirmation from Prike, but I think he tested this on
older kernels as well.

Alex

> Regards,
>    Felix
>
>
> >
> > Signed-off-by: Prike Liang <Prike.Liang@....com>
> > Acked-by: Alex Deucher <alexander.deucher@....com>
> > Signed-off-by: Alex Deucher <alexander.deucher@....com>
> > Signed-off-by: Sasha Levin <sashal@...nel.org>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/soc15.c | 45 +++++++++++++++++-------------
> >   1 file changed, 25 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 6a3486f52d698..ef5b3eedc8615 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -605,11 +605,34 @@ soc15_asic_reset_method(struct amdgpu_device *adev)
> >               return AMD_RESET_METHOD_MODE1;
> >   }
> >
> > +static bool soc15_need_reset_on_resume(struct amdgpu_device *adev)
> > +{
> > +     u32 sol_reg;
> > +
> > +     sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
> > +
> > +     /* Will reset for the following suspend abort cases.
> > +      * 1) Only reset limit on APU side, dGPU hasn't checked yet.
> > +      * 2) S3 suspend abort and TOS already launched.
> > +      */
> > +     if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > +                     !adev->suspend_complete &&
> > +                     sol_reg)
> > +             return true;
> > +
> > +     return false;
> > +}
> > +
> >   static int soc15_asic_reset(struct amdgpu_device *adev)
> >   {
> >       /* original raven doesn't have full asic reset */
> > -     if ((adev->apu_flags & AMD_APU_IS_RAVEN) ||
> > -         (adev->apu_flags & AMD_APU_IS_RAVEN2))
> > +     /* On the latest Raven, the GPU reset can be performed
> > +      * successfully. So now, temporarily enable it for the
> > +      * S3 suspend abort case.
> > +      */
> > +     if (((adev->apu_flags & AMD_APU_IS_RAVEN) ||
> > +         (adev->apu_flags & AMD_APU_IS_RAVEN2)) &&
> > +             !soc15_need_reset_on_resume(adev))
> >               return 0;
> >
> >       switch (soc15_asic_reset_method(adev)) {
> > @@ -1490,24 +1513,6 @@ static int soc15_common_suspend(void *handle)
> >       return soc15_common_hw_fini(adev);
> >   }
> >
> > -static bool soc15_need_reset_on_resume(struct amdgpu_device *adev)
> > -{
> > -     u32 sol_reg;
> > -
> > -     sol_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81);
> > -
> > -     /* Will reset for the following suspend abort cases.
> > -      * 1) Only reset limit on APU side, dGPU hasn't checked yet.
> > -      * 2) S3 suspend abort and TOS already launched.
> > -      */
> > -     if (adev->flags & AMD_IS_APU && adev->in_s3 &&
> > -                     !adev->suspend_complete &&
> > -                     sol_reg)
> > -             return true;
> > -
> > -     return false;
> > -}
> > -
> >   static int soc15_common_resume(void *handle)
> >   {
> >       struct amdgpu_device *adev = (struct amdgpu_device *)handle;
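
For readers following the diff above, the changed early-return condition in soc15_asic_reset() can be modeled outside the kernel roughly as below. This is only a sketch under stated assumptions: struct sketch_dev, the SKETCH_* bit values, and both function names are hypothetical stand-ins, and the sol_reg field replaces the RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81) read. It is not the driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; not the driver's real values. */
#define SKETCH_IS_APU        (1u << 0)  /* models AMD_IS_APU in adev->flags */
#define SKETCH_APU_IS_RAVEN  (1u << 0)  /* models AMD_APU_IS_RAVEN in adev->apu_flags */
#define SKETCH_APU_IS_RAVEN2 (1u << 1)  /* models AMD_APU_IS_RAVEN2 in adev->apu_flags */

/* Hypothetical stand-in for the fields of struct amdgpu_device used by the patch. */
struct sketch_dev {
	uint32_t flags;
	uint32_t apu_flags;
	bool in_s3;
	bool suspend_complete;
	uint32_t sol_reg;  /* stands in for the MP0_SMN_C2PMSG_81 register read */
};

/* Mirrors the condition in soc15_need_reset_on_resume() from the patch. */
static bool sketch_need_reset_on_resume(const struct sketch_dev *d)
{
	return (d->flags & SKETCH_IS_APU) && d->in_s3 &&
	       !d->suspend_complete && d->sol_reg != 0;
}

/* True when the patched soc15_asic_reset() would return 0 without resetting. */
static bool sketch_reset_is_skipped(const struct sketch_dev *d)
{
	bool is_raven = (d->apu_flags &
			 (SKETCH_APU_IS_RAVEN | SKETCH_APU_IS_RAVEN2)) != 0;

	return is_raven && !sketch_need_reset_on_resume(d);
}
```

In this model, a Raven APU whose S3 suspend was aborted (in_s3 set, suspend_complete clear, nonzero sol_reg) is no longer skipped and falls through to the reset method switch. Every other Raven case still returns early, as it did before the patch.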
