Message-ID: <20c82f01-af9f-f1b9-ec0e-9f66e5769894@amd.com>
Date: Tue, 2 Jul 2019 18:24:54 +0000
From: "Kuehling, Felix" <Felix.Kuehling@....com>
To: "Liu, Shaoyun" <Shaoyun.Liu@....com>,
Colin King <colin.king@...onical.com>,
Oded Gabbay <oded.gabbay@...il.com>,
"Deucher, Alexander" <Alexander.Deucher@....com>,
"Koenig, Christian" <Christian.Koenig@....com>,
"Zhou, David(ChunMing)" <David1.Zhou@....com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
"amd-gfx@...ts.freedesktop.org" <amd-gfx@...ts.freedesktop.org>
CC: "kernel-janitors@...r.kernel.org" <kernel-janitors@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] drm/amdkfd: fix potential null pointer dereference on
pointer peer_dev
I think this could happen if KFD initialization fails for a device.
Currently we'd add the device, and then remove it again. That may leave
a gap in the proximity domains. Oak just had a fix recently to clean
that up by only adding KFD devices to the topology after successful
initialization.
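
To illustrate the gap scenario, here is a minimal user-space sketch. It
is not the kfd_crat.c code: the sketch_* structs and functions are
hypothetical, simplified stand-ins for the topology device list and for
kfd_topology_device_by_proximity_domain. It only shows that a lookup
over a list with a missing proximity domain returns NULL, and that the
loop from the patch has to skip that entry.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the KFD topology structures. */
struct sketch_gpu {
	unsigned long hive_id;
};

struct sketch_topology_device {
	unsigned int proximity_domain;
	struct sketch_gpu *gpu;              /* NULL for a CPU-only node */
	struct sketch_topology_device *next; /* singly linked for brevity */
};

/* Return the device with the given proximity domain, or NULL if the
 * list has no such entry (for example, a device that was added and
 * then removed again after a failed initialization). */
static struct sketch_topology_device *
sketch_device_by_proximity_domain(struct sketch_topology_device *head,
				  unsigned int domain)
{
	for (; head; head = head->next)
		if (head->proximity_domain == domain)
			return head;
	return NULL;
}

/* Count already-processed peers (lower proximity domain) that sit in
 * the same hive, skipping gaps and CPU-only nodes. */
static int sketch_count_hive_peers(struct sketch_topology_device *head,
				   unsigned int proximity_domain,
				   unsigned long hive_id)
{
	unsigned int nid;
	int peers = 0;

	for (nid = 0; nid < proximity_domain; ++nid) {
		struct sketch_topology_device *peer_dev =
			sketch_device_by_proximity_domain(head, nid);

		/* Without the !peer_dev test, a gap would be dereferenced. */
		if (!peer_dev || !peer_dev->gpu)
			continue;
		if (peer_dev->gpu->hive_id != hive_id)
			continue;
		++peers;
	}
	return peers;
}
```

With a list holding domains 0, 1 and 3 only, the lookup for domain 2
returns NULL and the loop simply moves on to the next nid.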
Regards,
Felix
On 2019-07-02 11:29 a.m., Liu, Shaoyun wrote:
> According to the code comment, "we will loop GPUs that already be
> processed (with lower value of proximity_domain)", the device should
> already have been added to the topology_device_list. So in this case,
> kfd_topology_device_by_proximity_domain will not return a NULL pointer.
> If you really do get a null pointer dereference here, we must have
> some bigger problem that cannot be solved by adding the null check here.
>
> Regards
>
> shaoyun.liu
>
> On 2019-06-29 9:31 a.m., Colin King wrote:
>> From: Colin Ian King <colin.king@...onical.com>
>>
>> The call to kfd_topology_device_by_proximity_domain can return a NULL
>> pointer so add a null pointer check on peer_dev to the existing null
>> pointer check on peer_dev->gpu to avoid any potential null pointer
>> dereferences.
>>
>> Addresses-Coverity: ("Dereference on null return value")
>> Fixes: ae9a25aea7f3 ("drm/amdkfd: Generate xGMI direct iolink")
>> Signed-off-by: Colin Ian King <colin.king@...onical.com>
>> ---
>> drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> index 4e3fc284f6ac..cb6b46cfa6c2 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
>> @@ -1293,7 +1293,7 @@ static int kfd_create_vcrat_image_gpu(void *pcrat_image,
>>  	if (kdev->hive_id) {
>>  		for (nid = 0; nid < proximity_domain; ++nid) {
>>  			peer_dev = kfd_topology_device_by_proximity_domain(nid);
>> -			if (!peer_dev->gpu)
>> +			if (!peer_dev || !peer_dev->gpu)
>>  				continue;
>>  			if (peer_dev->gpu->hive_id != kdev->hive_id)
>>  				continue;
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@...ts.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx