Message-ID: <20180110155849.Horde.DDGbi3ysasL2eHmvZ4k8adb@gator4166.hostgator.com>
Date:   Wed, 10 Jan 2018 15:58:49 -0600
From:   "Gustavo A. R. Silva" <garsilva@...eddedor.com>
To:     Felix Kuehling <felix.kuehling@....com>
Cc:     Oded Gabbay <oded.gabbay@...il.com>,
        Alex Deucher <alexander.deucher@....com>,
        Christian König <christian.koenig@....com>,
        David Airlie <airlied@...ux.ie>, amd-gfx@...ts.freedesktop.org,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/amdkfd: Fix potential NULL pointer dereferences

Hi Felix,

Quoting Felix Kuehling <felix.kuehling@....com>:

> Hi Gustavo,
>
> Thanks for catching that. When returning a fault, I think you also need
> to srcu_read_unlock(&kfd_processes_srcu, idx).
>
> However, instead of returning an error, I think I'd prefer to skip PDDs
> that can't be found with continue statements. That way others would
> still suspend and resume successfully. Maybe just print a WARN_ON for
> PDDs that aren't found, because that's an unexpected situation,
> currently. Maybe in the future it could be a normal thing if we ever
> support GPU hotplug.
>

I got it. In that case, what do you think about the following patch instead?

index a22fb071..4ff5f0f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -461,7 +461,8 @@ int kfd_bind_processes_to_device(struct kfd_dev *dev)
         hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
                 mutex_lock(&p->mutex);
                 pdd = kfd_get_process_device_data(dev, p);
-               if (pdd->bound != PDD_BOUND_SUSPENDED) {
+
+               if (WARN_ON(!pdd) || pdd->bound != PDD_BOUND_SUSPENDED) {
                         mutex_unlock(&p->mutex);
                         continue;
                 }
@@ -501,6 +502,11 @@ void kfd_unbind_processes_from_device(struct kfd_dev *dev)
                 mutex_lock(&p->mutex);
                 pdd = kfd_get_process_device_data(dev, p);

+               if (WARN_ON(!pdd)) {
+                       mutex_unlock(&p->mutex);
+                       continue;
+               }
+
                 if (pdd->bound == PDD_BOUND)
                         pdd->bound = PDD_BOUND_SUSPENDED;
                 mutex_unlock(&p->mutex);
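
For anyone who wants to try the skip-and-continue idea outside the kernel, here is a minimal user-space sketch of it. It is not the driver code: warn_on(), get_pdd(), suspend_all() and the integer "locked" flag are hypothetical stand-ins I made up for WARN_ON(), kfd_get_process_device_data(), the loop body and p->mutex. It only illustrates that every early exit from the loop body drops the per-process lock, and that a missing PDD does not stop the remaining processes from being handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space model; names only mirror the patch. */
enum pdd_bound { PDD_UNBOUND, PDD_BOUND, PDD_BOUND_SUSPENDED };

struct pdd {
	int dev_id;
	enum pdd_bound bound;
};

struct process {
	int locked;		/* stand-in for p->mutex */
	struct pdd *pdds;
	size_t n_pdds;
};

static void lock(struct process *p)
{
	assert(!p->locked);	/* catch double locking */
	p->locked = 1;
}

static void unlock(struct process *p)
{
	assert(p->locked);	/* catch unbalanced unlock */
	p->locked = 0;
}

/* Stand-in for WARN_ON(): report the condition and return its truth value. */
static int warn_on(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond != 0;
}

/* Stand-in for the PDD lookup: returns NULL when the device has no PDD. */
static struct pdd *get_pdd(int dev_id, struct process *p)
{
	for (size_t i = 0; i < p->n_pdds; i++)
		if (p->pdds[i].dev_id == dev_id)
			return &p->pdds[i];
	return NULL;
}

/* Suspend every bound PDD for dev_id; returns how many processes were skipped. */
static size_t suspend_all(int dev_id, struct process *procs, size_t n)
{
	size_t skipped = 0;

	for (size_t i = 0; i < n; i++) {
		struct process *p = &procs[i];
		struct pdd *pdd;

		lock(p);
		pdd = get_pdd(dev_id, p);

		if (warn_on(!pdd, "process has no PDD for this device")) {
			unlock(p);	/* release before skipping */
			skipped++;
			continue;	/* keep going with the other processes */
		}

		if (pdd->bound == PDD_BOUND)
			pdd->bound = PDD_BOUND_SUSPENDED;
		unlock(p);
	}
	return skipped;
}
```

With three processes where the middle one has no PDD for the device, suspend_all() reports one skipped process, the other two end up in PDD_BOUND_SUSPENDED, and no lock is left held.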


Thank you for the feedback.
--
Gustavo

> Regards,
>   Felix
>
>
> On 2018-01-10 11:50 AM, Gustavo A. R. Silva wrote:
>> In case kfd_get_process_device_data returns null, there are some
>> null pointer dereferences in functions kfd_bind_processes_to_device
>> and kfd_unbind_processes_from_device.
>>
>> Fix this by null checking pdd before dereferencing it.
>>
>> Addresses-Coverity-ID: 1463794 ("Dereference null return value")
>> Addresses-Coverity-ID: 1463772 ("Dereference null return value")
>> Signed-off-by: Gustavo A. R. Silva <garsilva@...eddedor.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index a22fb071..29d51d5 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -461,6 +461,13 @@ int kfd_bind_processes_to_device(struct kfd_dev *dev)
>>  	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
>>  		mutex_lock(&p->mutex);
>>  		pdd = kfd_get_process_device_data(dev, p);
>> +
>> +		if (!pdd) {
>> +			pr_err("Process device data doesn't exist\n");
>> +			mutex_unlock(&p->mutex);
>> +			return -EFAULT;
>> +		}
>> +
>>  		if (pdd->bound != PDD_BOUND_SUSPENDED) {
>>  			mutex_unlock(&p->mutex);
>>  			continue;
>> @@ -501,6 +508,11 @@ void kfd_unbind_processes_from_device(struct kfd_dev *dev)
>>  		mutex_lock(&p->mutex);
>>  		pdd = kfd_get_process_device_data(dev, p);
>>
>> +		if (!pdd) {
>> +			mutex_unlock(&p->mutex);
>> +			return;
>> +		}
>> +
>>  		if (pdd->bound == PDD_BOUND)
>>  			pdd->bound = PDD_BOUND_SUSPENDED;
>>  		mutex_unlock(&p->mutex);