Message-ID: <51018469-3bab-e56d-7407-b16170b5d74c@amd.com>
Date: Thu, 17 Feb 2022 10:57:28 -0500
From: Luben Tuikov <luben.tuikov@....com>
To: trix@...hat.com, alexander.deucher@....com,
christian.koenig@....com, Xinhui.Pan@....com, airlied@...ux.ie,
daniel@...ll.ch, nathan@...nel.org, ndesaulniers@...gle.com,
Hawking.Zhang@....com, john.clements@....com, tao.zhou1@....com,
YiPeng.Chai@....com, Stanley.Yang@....com, Dennis.Li@....com,
mukul.joshi@....com, nirmoy.das@....com
Cc: amd-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, llvm@...ts.linux.dev
Subject: Re: [PATCH] drm/amdgpu: fix amdgpu_ras_block_late_init error handler

Thanks for catching this.

Reviewed-by: Luben Tuikov <luben.tuikov@....com>

Regards,
Luben

On 2022-02-17 10:38, trix@...hat.com wrote:
> From: Tom Rix <trix@...hat.com>
>
> Clang build fails with
> amdgpu_ras.c:2416:7: error: variable 'ras_obj' is used uninitialized
> whenever 'if' condition is true
> if (adev->in_suspend || amdgpu_in_reset(adev)) {
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> amdgpu_ras.c:2453:6: note: uninitialized use occurs here
> if (ras_obj->ras_cb)
> ^~~~~~~
>
> There is a logic error in the ordering of the error handler's labels.
> For example, sysfs: is the last goto target in the normal code path
> but sits in the middle of the error handler. Rework the error handler.
>
> cleanup: is the first error, so its handler should be last.
>
> interrupt: is the second error, so its handler is next. interrupt:
> handles the failure of amdgpu_ras_interrupt_add_handler() by
> calling amdgpu_ras_interrupt_remove_handler(). This is wrong:
> remove() assumes the interrupt has been set up, not torn down by
> add(). Change the goto label to cleanup.
>
> sysfs: is the last error, so its handler should be first. sysfs:
> handles the failure of amdgpu_ras_sysfs_create() by calling
> amdgpu_ras_sysfs_remove(). But when the create() fails there
> is nothing added so there is nothing to remove. This error
> handler is not needed. Remove the error handler and change
> goto label to interrupt.
>
> Fixes: b293e891b057 ("drm/amdgpu: add helper function to do common ras_late_init/fini (v3)")
> Signed-off-by: Tom Rix <trix@...hat.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> index b5cd21cb6e58..c5c8a666110f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
> @@ -2432,12 +2432,12 @@ int amdgpu_ras_block_late_init(struct amdgpu_device *adev,
> if (ras_obj->ras_cb) {
> r = amdgpu_ras_interrupt_add_handler(adev, ras_block);
> if (r)
> - goto interrupt;
> + goto cleanup;
> }
>
> r = amdgpu_ras_sysfs_create(adev, ras_block);
> if (r)
> - goto sysfs;
> + goto interrupt;
>
> /* Those are the cached values at init.
> */
> @@ -2447,12 +2447,11 @@ int amdgpu_ras_block_late_init(struct amdgpu_device *adev,
> }
>
> return 0;
> -cleanup:
> - amdgpu_ras_sysfs_remove(adev, ras_block);
> -sysfs:
> +
> +interrupt:
> if (ras_obj->ras_cb)
> amdgpu_ras_interrupt_remove_handler(adev, ras_block);
> -interrupt:
> +cleanup:
> amdgpu_ras_feature_enable(adev, ras_block, 0);
> return r;
> }
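
For readers following the label reordering, here is a minimal standalone sketch of the unwinding pattern the patch arrives at: each failure jumps to the label that undoes only the steps that had already succeeded, so the labels appear in reverse order of the setup steps. Everything in it (late_init(), step(), the fail_at enum and the three state flags) is hypothetical and stands in for the real amdgpu calls; it is not the driver code.

```c
/* Hypothetical stand-ins for the setup steps; NOT the amdgpu API. */
enum fail_at { FAIL_NONE, FAIL_ENABLE, FAIL_IRQ, FAIL_SYSFS };

/* Flags recording which resources are currently set up. */
static int feature_on, irq_on, sysfs_on;

/* Pretend to perform one setup step; fail if it is the chosen one. */
static int step(enum fail_at want, enum fail_at here, int *state)
{
	if (want == here)
		return -1;	/* a failed step leaves nothing behind */
	*state = 1;
	return 0;
}

int late_init(enum fail_at f)
{
	int r;

	feature_on = irq_on = sysfs_on = 0;

	r = step(f, FAIL_ENABLE, &feature_on);
	if (r)
		return r;		/* nothing to undo yet */

	r = step(f, FAIL_IRQ, &irq_on);
	if (r)
		goto cleanup;		/* only the feature needs undoing */

	r = step(f, FAIL_SYSFS, &sysfs_on);
	if (r)
		goto interrupt;		/* undo the handler, then the feature */

	return 0;

	/* Labels run in reverse order of the setup steps above. */
interrupt:
	irq_on = 0;
cleanup:
	feature_on = 0;
	return r;
}
```

Calling late_init(FAIL_SYSFS) in this sketch falls through both labels, while late_init(FAIL_IRQ) enters at cleanup: and never touches the handler that was not installed. The last step has no label of its own because a failed create leaves nothing to remove, which is the same reasoning the commit message gives for dropping the sysfs: handler.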