lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <04f35ef6-d6d6-4a30-89fe-6e578c3f03be@arm.com>
Date: Wed, 16 Jul 2025 16:59:00 +0100
From: Steven Price <steven.price@....com>
To: Caterina Shablia <caterina.shablia@...labora.com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 Frank Binns <frank.binns@...tec.com>, Matt Coster <matt.coster@...tec.com>,
 Karol Herbst <kherbst@...hat.com>, Lyude Paul <lyude@...hat.com>,
 Danilo Krummrich <dakr@...nel.org>,
 Boris Brezillon <boris.brezillon@...labora.com>,
 Liviu Dudau <liviu.dudau@....com>, Lucas De Marchi
 <lucas.demarchi@...el.com>,
 Thomas Hellström <thomas.hellstrom@...ux.intel.com>,
 Rodrigo Vivi <rodrigo.vivi@...el.com>
Cc: dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
 nouveau@...ts.freedesktop.org, intel-xe@...ts.freedesktop.org,
 asahi@...ts.linux.dev, Asahi Lina <lina@...hilina.net>
Subject: Re: [PATCH v4 7/7] drm/panthor: Add support for repeated mappings

On 15/07/2025 16:17, Caterina Shablia wrote:
> El viernes, 11 de julio de 2025 16:03:26 (hora de verano de Europa central), 
> Steven Price escribió:
>> On 07/07/2025 18:04, Caterina Shablia wrote:
>>> From: Boris Brezillon <boris.brezillon@...labora.com>
>>>
>>> This allows us to optimize mapping of a relatively small
>>> portion of a BO over and over in a large VA range, which
>>> is useful to support Vulkan sparse bindings in an efficient
>>> way.
>>>
>>> Signed-off-by: Boris Brezillon <boris.brezillon@...labora.com>
>>> Co-developed-by: Caterina Shablia <caterina.shablia@...labora.com>
>>> Signed-off-by: Caterina Shablia <caterina.shablia@...labora.com>
>>
>> This looks like the right sort of shape. From an uAPI perspective I'm
>> not sure whether u32 is the right type for bo_repeat_range. While I
>> can't immediately see a use for having a large repeat range it seems a
>> little silly to close it off when we're going to have padding afterwards
>> anyway. Obviously the kernel would reject large values because the
>> internal APIs are only u32. But it would enable this to be fixed if we
>> ever discover a usecase without a uAPI change.
>>
>>> ---
>>>
>>>  drivers/gpu/drm/panthor/panthor_drv.c |  3 +-
>>>  drivers/gpu/drm/panthor/panthor_mmu.c | 78 ++++++++++++++++++++++++---
>>>  include/uapi/drm/panthor_drm.h        | 23 ++++++++
>>>  3 files changed, 95 insertions(+), 9 deletions(-)
>>>

[...]

>>> +static int
>>> +panthor_vm_repeated_map_pages(struct panthor_vm *vm, u64 iova, int prot,
>>> +			      struct sg_table *sgt, u64 offset, u64 
> size,
>>> +			      u64 count)
>>> +{
>>> +	/* FIXME: we really need to optimize this at the io_pgtable level. 
> */
>>
>> Do you have plans for optimizing this? 
> I personally don't have any plans, no, but maybe Boris does. I'll forward this 
> question to him once he's back from his vacation.
>> How bad is the performance
>> without optimizing?
> It's better than the alternative of poking vm_bind with a morbillion 
> drm_panthor_vm_bind_ops. More seriously, I don't really have any workloads 
> beside VK CTS to measure, for now. There's some stuff we should try to do in 
> panvk first, like using a 2M dummy_page and doing some gymnastics when mapping 
> it so we get huge mappings, which will hopefully lessen the pressure on this 
> loop.

Ok, sounds like some more investigation is needed. I'm a little wary of
a new feature which has a FIXME like this as it sounds like the design
hasn't been tested yet. I'm happy with the current code if it's indeed
"good enough", but if it's still too slow to be actually usable then I
think we need to fix it before merging rather than have a new feature
which isn't actually fast enough to use.

>>
>>> +	for (u64 i = 0; i < count; i++) {
>>> +		int ret;
>>> +
>>> +		ret = panthor_vm_map_pages(vm, iova + (size * i), prot,
>>> +					   sgt, offset, size);
>>> +		if (ret) {
>>> +			panthor_vm_unmap_pages(vm, iova, size * (i - 
> 1));
>>> +			return ret;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>

[...]

>>>
>>> +	/**
>>> +	 * @bo_repeat_range: The size of the range to be repeated.
>>> +	 *
>>> +	 * Must be zero if DRM_PANTHOR_VM_BIND_OP_MAP_REPEAT is not set in
>>> +	 * flags.
>>> +	 *
>>> +	 * Size must be a multiple of bo_repeat_range.
>>> +	 */
>>> +	__u32 bo_repeat_range;
>>> +
>>> +	/** @pad: Padding field. MBZ. */
>>> +	__u32 pad;
>>
>> If we're going to have the padding then the kernel needs to check that
>> this padding is zero, so that it can be available for future use.
> I decided to go with your suggestion to change bo_repeat_range to be an __u64, 
> but rejecting vm_binds with values above 2^32-1 for now.

Yeah I think that's cleanest. Please do include a comment in the uapi
file about the 2^32-1 limit though.

Thanks,
Steve


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ