Message-ID: <CADnq5_NGLrrFmFHFX2bC7naByJGofEiYQyWvRP6CO4BDFo52TQ@mail.gmail.com>
Date: Fri, 17 May 2024 11:46:16 -0400
From: Alex Deucher <alexdeucher@...il.com>
To: Christian König <christian.koenig@....com>
Cc: Tim Van Patten <timvp@...omium.org>, LKML <linux-kernel@...r.kernel.org>, 
	alexander.deucher@....com, prathyushi.nangia@....com, 
	Tim Van Patten <timvp@...gle.com>, Daniel Vetter <daniel@...ll.ch>, David Airlie <airlied@...il.com>, 
	Felix Kuehling <Felix.Kuehling@....com>, Ikshwaku Chauhan <ikshwaku.chauhan@....com>, Le Ma <le.ma@....com>, 
	Lijo Lazar <lijo.lazar@....com>, Mario Limonciello <mario.limonciello@....com>, 
	"Pan, Xinhui" <Xinhui.Pan@....com>, "Shaoyun.liu" <Shaoyun.liu@....com>, 
	Shiwu Zhang <shiwu.zhang@....com>, Srinivasan Shanmugam <srinivasan.shanmugam@....com>, 
	amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1

On Fri, May 17, 2024 at 2:35 AM Christian König
<christian.koenig@....com> wrote:
>
> Am 16.05.24 um 19:57 schrieb Tim Van Patten:
> > From: Tim Van Patten <timvp@...gle.com>
> >
> > The following commit updated gmc->noretry from 0 to 1 for GC HW IP
> > 9.3.0:
> >
> >      commit 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
> >
> > This causes the device to hang when a page fault occurs, until the
> > device is rebooted. Instead, revert back to gmc->noretry=0 so the device
> > is still responsive.
>
> Wait a second. Why does the device hang on a page fault? That shouldn't
> happen independent of noretry.
>
> So that strongly sounds like this is just hiding a bug elsewhere.

Fair enough, but this is also the only GFX9 APU which defaults to
noretry=1; all of the rest are dGPUs.  I'd argue it should either align
with the other GFX9 APUs, or they should all enable noretry=1.

Alex

>
> Regards,
> Christian.
>
> >
> > Fixes: 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
> > Signed-off-by: Tim Van Patten <timvp@...gle.com>
> > ---
> >
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 1 -
> >   1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > index be4629cdac049..bff54a20835f1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -876,7 +876,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
> >       struct amdgpu_gmc *gmc = &adev->gmc;
> >       uint32_t gc_ver = amdgpu_ip_version(adev, GC_HWIP, 0);
> >       bool noretry_default = (gc_ver == IP_VERSION(9, 0, 1) ||
> > -                             gc_ver == IP_VERSION(9, 3, 0) ||
> >                               gc_ver == IP_VERSION(9, 4, 0) ||
> >                               gc_ver == IP_VERSION(9, 4, 1) ||
> >                               gc_ver == IP_VERSION(9, 4, 2) ||
>
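
The default selection that the quoted diff changes can be sketched in
standalone C as below. This is only an illustration under stated
assumptions, not the driver code: the IP_VERSION packing, the function
names noretry_default_before/noretry_default_after/resolve_noretry, and
the "-1 means use the default" override convention are all assumed for
the example, and the version list only covers the entries visible in the
diff context (the real list continues past what is quoted).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: pack major/minor/revision into one comparable integer. */
#define IP_VERSION(mj, mn, rv) \
	(((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))

/* Hypothetical helper: default as in the diff context, 9.3.0 included. */
bool noretry_default_before(uint32_t gc_ver)
{
	return gc_ver == IP_VERSION(9, 0, 1) ||
	       gc_ver == IP_VERSION(9, 3, 0) ||
	       gc_ver == IP_VERSION(9, 4, 0) ||
	       gc_ver == IP_VERSION(9, 4, 1) ||
	       gc_ver == IP_VERSION(9, 4, 2);
}

/* Hypothetical helper: same list with the 9.3.0 line removed by the patch. */
bool noretry_default_after(uint32_t gc_ver)
{
	return gc_ver == IP_VERSION(9, 0, 1) ||
	       gc_ver == IP_VERSION(9, 4, 0) ||
	       gc_ver == IP_VERSION(9, 4, 1) ||
	       gc_ver == IP_VERSION(9, 4, 2);
}

/*
 * Assumed override convention: -1 selects the per-IP default,
 * any other value (0 or 1) is taken as an explicit user setting.
 */
int resolve_noretry(int user_override, bool noretry_default)
{
	if (user_override == -1)
		return noretry_default ? 1 : 0;
	return user_override;
}
```

With these definitions, resolve_noretry(-1, noretry_default_before(IP_VERSION(9, 3, 0)))
yields 1, while the same call with noretry_default_after yields 0, which
is the behavior change under discussion. An explicit override is
unaffected by the patch in this sketch.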
