Message-ID: <46c64cf0-4cd5-4b6f-a224-ffe4bac5bb7a@linux.dev>
Date: Wed, 2 Oct 2024 03:26:46 +0800
From: Sui Jingfeng <sui.jingfeng@...ux.dev>
To: Lucas Stach <l.stach@...gutronix.de>,
 Russell King <linux+etnaviv@...linux.org.uk>,
 Christian Gmeiner <christian.gmeiner@...il.com>
Cc: David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
 etnaviv@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/etnaviv: Print error message if inserting IOVA
 address range fails

Hi,

On 2024/10/1 16:27, Lucas Stach wrote:
> Hi Sui,
>
> Am Dienstag, dem 01.10.2024 um 06:17 +0800 schrieb Sui Jingfeng:
>> Etnaviv assumes that the GPU page size is 4 KiB, yet on some systems the
>> CPU page size is 16 KiB. The size of an etnaviv buffer object is aligned
>> to the CPU page size on the kernel side, while userspace still assumes a
>> 4 KiB page size and allocates VA in 4 KiB units. This makes softpin
>> (userspace-managed per-process address spaces) fail, because a kernel-side
>> BO takes up more address space than userspace assumes whenever the size of
>> a BO is not CPU-page-size aligned.
>>
> Seems we need to track the GPU and CPU allocation sizes separately.


The idea is good, and I have already tried it.

I added a 'user_size' member to struct etnaviv_gem_object and used it
to track the size that userspace thinks the BO has (in other words,
the size the user is actually allowed to use).

Using 'user_size' for the pin/softpin reservation partly solves the VA
address space collisions in the softpin case; it mostly worked in my
hasty testing. But ...
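
For reference, what I tried looks roughly like this (only a sketch of
the idea, not the actual diff I tested):

/* etnaviv_gem.h: remember the size userspace accounts for, separately
 * from the CPU-page-aligned size of the backing memory.
 */
struct etnaviv_gem_object {
	struct drm_gem_object base;
	...
	size_t user_size;	/* BO size as seen by userspace, in bytes */
	...
};

/* etnaviv_gem.c: at object creation time, before the request is
 * rounded up for the CPU side.
 */
etnaviv_obj->user_size = size;		/* original user request */
/* etnaviv_obj->base.size stays PAGE_ALIGN(size), as it is today */

/* etnaviv_mmu.c: in etnaviv_iommu_map_gem(), reserve the softpin IOVA
 * node with user_size instead of etnaviv_obj->base.size.
 */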

> Userspace is correct in assuming that the GPU page size is 4K and
> buffers are aligned to this granule.


Vivante GPUs support both 4 KiB and 64 KiB GPU page sizes.


>   There should be no need to waste GPU VA space


We have nearly 4 GiB of GPU VA space, and as far as I can see we only
use a small fraction of it. So aren't we fairly rich in VA space?
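
As a back-of-the-envelope estimate (the BO count is just illustrative):
with 16 KiB CPU pages the worst case is about 12 KiB of padding per BO,
so even with, say, 10,000 small BOs in flight the padding costs at most
~120 MiB out of the ~4 GiB VA space, roughly 3%.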


> just because the CPU page size is larger than that and we
> need to overallocate buffers to suit the CPU.


All of a single CPU page shares the same caching attributes, therefore
I imagine that a single GPU VA range should occupy at least an entire
CPU page.

Otherwise, up to four GPU VA ranges could share a single CPU page, each
possibly mapped with caching attributes different from the others.
This gets coherency requirements involved.
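
Concretely, I mean something like reserving whole CPU pages when
inserting the IOVA node (again only a sketch, reusing the 'user_size'
idea from above; PAGE_SIZE here is the CPU page size):

/*
 * With 16 KiB CPU pages, up to four 4 KiB GPU VA ranges could otherwise
 * end up in one CPU page, each potentially mapped with different caching
 * attributes (e.g. WC vs. cached), which drags coherency into the picture.
 * Rounding the reservation up to whole CPU pages avoids that:
 */
ret = etnaviv_iommu_find_iova(context, node,
			      ALIGN(etnaviv_obj->user_size, PAGE_SIZE));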


>> Print an error message to aid debugging when such an issue happens.
>>
>> Signed-off-by: Sui Jingfeng <sui.jingfeng@...ux.dev>
>> ---
>> For example, when running glmark2-drm:
>>
>> [kernel space debug log]
>>
>>   etnaviv 0000:03:00.0: Insert bo failed, va: fd38b000, size: 4000
>>   etnaviv 0000:03:00.0: Insert bo failed, va: fd38a000, size: 4000
>>
>> [user space debug log]
>>
>> bo->va = 0xfd48c000, bo->size=100000
>> bo->va = 0xfd38c000, bo->size=100000
>> bo->va = 0xfd38b000, bo->size=1000   <-- Inserting the IOVA starts to fail here.
>> bo->va = 0xfd38a000, bo->size=1000
>> bo->va = 0xfd389000, bo->size=1000
>>
>> [texture] texture-filter=nearest:MESA: error: etna_cmd_stream_flush:238: submit failed: -28 (No space left on device)
>> ---
>>   drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> index 1661d589bf3e..682f27b27d59 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
>> @@ -310,8 +310,12 @@ int etnaviv_iommu_map_gem(struct etnaviv_iommu_context *context,
>>   	else
>>   		ret = etnaviv_iommu_find_iova(context, node,
>>   					      etnaviv_obj->base.size);
>> -	if (ret < 0)
>> +	if (ret < 0) {
>> +		dev_err(context->global->dev,
>> +			"Insert iova failed, va: %llx, size: %zx\n",
>> +			va, etnaviv_obj->base.size);
> As this might happen for a lot of buffers in a single submit and
> userspace might be unimpressed by the submit failure and keep pushing
> new submits, this has a potential to spam the logs. Please use
> dev_err_ratelimited. Other than that, this patch looks good.
>
> Regards,
> Lucas
>
>>   		goto unlock;
>> +	}
>>   
>>   	mapping->iova = node->start;
>>   	ret = etnaviv_iommu_map(context, node->start, sgt,
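
With the ratelimited helper, the hunk would look roughly like this:

	if (ret < 0) {
		dev_err_ratelimited(context->global->dev,
				    "Insert iova failed, va: %llx, size: %zx\n",
				    va, etnaviv_obj->base.size);
		goto unlock;
	}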

-- 
Best regards,
Sui

