linux-kernel - Re: [PATCH] drm/ttm: fix potential null ptr deref in when mem space alloc fails


Message-ID: <a29021bd-1447-038a-e141-d06b53173d36@collabora.com>
Date:   Mon, 21 Mar 2022 15:44:13 +0000
From:   Robert Beckett <bob.beckett@...labora.com>
To:     Christian König <christian.koenig@....com>,
        dri-devel@...ts.freedesktop.org, Huang Rui <ray.huang@....com>,
        David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>,
        Matthew Auld <matthew.auld@...el.com>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/ttm: fix potential null ptr deref in when mem space
 alloc fails



On 21/03/2022 09:51, Christian König wrote:
> Am 18.03.22 um 20:50 schrieb Robert Beckett:
>> when allocating a resource in place it is common to free the buffer's
>> resource, then allocate a new resource in a different placement.
>>
>> e.g. amdgpu_bo_create_kernel_at calls ttm_resource_free, then calls
>> ttm_bo_mem_space.
> 
> Well yes I'm working the drivers towards this, but NAK at the moment. 
> Currently bo->resource is never expected to be NULL.
> 
> And yes I'm searching for this bug in amdgpu for quite a while. Where 
> exactly does that happen?

In my case, I am writing new code for i915 that does this. I will switch
it to allocate the new resource first, then free the old one only if the
allocation succeeds.
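
A minimal sketch of that allocate-first ordering, to make the invariant
concrete. Everything named mock_* here is a hypothetical stand-in written
for illustration (not the real TTM or i915 API), and the simulate_failure
flag exists only to model an allocation failure under memory pressure:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real TTM types. */
struct mock_resource {
	int mem_type;
};

struct mock_bo {
	struct mock_resource *resource;
};

/* Hypothetical allocator: fails when simulate_failure is non-zero. */
static struct mock_resource *mock_resource_alloc(int mem_type,
						 int simulate_failure)
{
	struct mock_resource *res;

	if (simulate_failure)
		return NULL;

	res = malloc(sizeof(*res));
	if (res)
		res->mem_type = mem_type;
	return res;
}

/*
 * Move the buffer to a new placement. The old resource is released only
 * after the new one exists, so bo->resource is never NULL, even when the
 * allocation fails.
 */
static int mock_bo_replace_resource(struct mock_bo *bo, int mem_type,
				    int simulate_failure)
{
	struct mock_resource *new_res;
	struct mock_resource *old_res;

	new_res = mock_resource_alloc(mem_type, simulate_failure);
	if (!new_res)
		return -ENOMEM; /* bo->resource still points at the old one */

	old_res = bo->resource;
	bo->resource = new_res;
	free(old_res);
	return 0;
}
```

With this ordering a failed allocation leaves the buffer exactly as it was,
so no later error path can observe a NULL bo->resource.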

For the existing amdgpu case, see
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c?h=v5.17#n384


amdgpu_bo_create_kernel_at calls ttm_resource_free, then calls 
ttm_bo_mem_space. If the ttm_bo_mem_space call fails (e.g. due to memory 
pressure), then the error path will try to deref bo->resource, which 
will be null at that point.


To fix this, I honestly don't see a reason not to also have a NULL safety
check there. It could check early and return an error if bo->resource is
NULL. I think defensive programming makes sense here; it is better than a
NULL deref if someone programs it wrong.
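
As a rough, self-contained illustration of the guarded error path (it
mirrors the shape of the condition in the patch, but the sketch_* types,
SKETCH_PL_SYSTEM and the lru_bumped flag are hypothetical stand-ins, not
the real TTM definitions):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a "system memory" placement type. */
#define SKETCH_PL_SYSTEM 0

struct sketch_resource {
	int mem_type;
};

struct sketch_bo {
	struct sketch_resource *resource; /* may be NULL after a free */
	unsigned int pin_count;
	int lru_bumped; /* records whether the LRU helper was called */
};

/* Hypothetical stand-in for moving the buffer to the LRU tail. */
static void sketch_move_to_lru_tail(struct sketch_bo *bo)
{
	bo->lru_bumped = 1;
}

/*
 * Error path of a placement attempt. The NULL test comes first, so
 * short-circuit evaluation of && guarantees that bo->resource is only
 * dereferenced when it is valid.
 */
static int sketch_error_path(struct sketch_bo *bo, int ret)
{
	if (bo->resource && bo->resource->mem_type == SKETCH_PL_SYSTEM &&
	    !bo->pin_count)
		sketch_move_to_lru_tail(bo);
	return ret;
}
```

Calling sketch_error_path() on a buffer whose resource was already freed
just propagates the error code instead of dereferencing NULL.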



> 
> Amdgpu is supposed to allocate a new resource first, then do a swap and 
> then free the old one.
> 
> Thanks,
> Christian.
> 
>>
>> In this situation, bo->resource will be null as it is cleared during
>> the initial freeing of the previous resource.
>> This leads to a null deref.
>>
>> Fixes: d3116756a710 (drm/ttm: rename bo->mem and make it a pointer)
>>
>> Signed-off-by: Robert Beckett <bob.beckett@...labora.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index db3dc7ef5382..62b29ee7d040 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -875,7 +875,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
>>       }
>>   error:
>> -    if (bo->resource->mem_type == TTM_PL_SYSTEM && !bo->pin_count)
>> +    if (bo->resource && bo->resource->mem_type == TTM_PL_SYSTEM && !bo->pin_count)
>>           ttm_bo_move_to_lru_tail_unlocked(bo);
>>       return ret;
>