Message-ID: <15cc6de2-d1af-4504-a08d-1278329a2113@nvidia.com>
Date: Mon, 3 Nov 2025 17:12:09 -0800
From: James Jones <jajones@...dia.com>
To: Mohamed Ahmed <mohamedahmedegypt2001@...il.com>
Cc: linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
Mary Guillemard <mary@...y.zone>,
Faith Ekstrand <faith.ekstrand@...labora.com>, Lyude Paul
<lyude@...hat.com>, Danilo Krummrich <dakr@...nel.org>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
nouveau@...ts.freedesktop.org
Subject: Re: [PATCH v4 2/5] drm/nouveau/uvmm: Allow larger pages
On 11/3/25 15:53, Mohamed Ahmed wrote:
> Thanks a lot for the shout out! Looking at this more closely, the logic
> here is actually redundant. It was originally copied over directly from
> the bo allocation code to stay on the safe side (the idea back then was
> to make the bo and vmm sides match exactly). We aren't at risk of having
> an aligned address that lands in the wrong memory type, because the bo
> allocation code (nouveau_bo.c:321) forces anything with the GART flag to
> a 4K page size; anything that gets a larger page size is VRAM-only.
> Additionally, objects marked VRAM-only currently aren't evicted to host
> memory except under high memory pressure, and in that case the context
> is paused until the objects in question are paged back in, so we don't
> have to worry about memory placement there either.
>
> The memory placement check in the vmm code could be removed, but I am
> leaning towards leaving it as is, again to stay on the safe side. It is
> also worth keeping for the future: one of the things we want to
> investigate is the full set of memory placement rules, because the
> "only 4K is allowed for host memory" limit that nouveau imposes is a
> source of many pains in userspace (it was originally thought to be a HW
> limitation, but it seems it actually isn't), and having the checks on
> both the bo and vmm paths would be a good starting point for that.
OK, thanks for the explanation. I'm fine with leaving the check as-is in
that case.
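For reference, the allocation-side constraint being described looks
roughly like the following (a paraphrased sketch from memory, not the
exact nouveau_bo.c code): when picking nvbo->page, page sizes that can't
back one of the requested domains are skipped, and host-capable
placements are capped at PAGE_SHIFT, so any BO that may end up in GART
only ever gets a 4K page size.

        /* Paraphrased sketch of the bo-side page-size selection; the
         * real code also considers things like compression and the
         * buffer size.
         */
        for (i = 0; i < vmm->page_nr; i++) {
                /* Skip page sizes that can't support the needed domains. */
                if ((domain & NOUVEAU_GEM_DOMAIN_VRAM) && !vmm->page[i].vram)
                        continue;
                if ((domain & NOUVEAU_GEM_DOMAIN_GART) &&
                    (!vmm->page[i].host || vmm->page[i].shift > PAGE_SHIFT))
                        continue;
                /* The first (largest) surviving shift becomes nvbo->page. */
        }
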
Given that, for the series:
Reviewed-by: James Jones <jajones@...dia.com>
Thanks,
-James
> Thanks a lot again,
> Mohamed
>
> On Fri, Oct 31, 2025 at 7:01 PM James Jones <jajones@...dia.com> wrote:
>>
>> On 10/31/25 03:49, Mohamed Ahmed wrote:
>>> From: Mary Guillemard <mary@...y.zone>
>>>
>>> Now that everything in UVMM knows about the variable page shift, we can
>>> select larger values.
>>>
>>> The proposed approach relies on nouveau_bo::page unless it would cause
>>> alignment issues (in which case we fall back to searching for an
>>> appropriate shift).
>>>
>>> Signed-off-by: Mary Guillemard <mary@...y.zone>
>>> Co-developed-by: Mohamed Ahmed <mohamedahmedegypt2001@...il.com>
>>> Signed-off-by: Mohamed Ahmed <mohamedahmedegypt2001@...il.com>
>>> ---
>>> drivers/gpu/drm/nouveau/nouveau_uvmm.c | 60 +++++++++++++++++++++++++-
>>> 1 file changed, 58 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>> index 2cd0835b05e8..ab8933b88337 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
>>> @@ -454,6 +454,62 @@ op_unmap_prepare_unwind(struct drm_gpuva *va)
>>> drm_gpuva_insert(va->vm, va);
>>> }
>>>
>>> +static bool
>>> +op_map_aligned_to_page_shift(const struct drm_gpuva_op_map *op, u8 page_shift)
>>> +{
>>> + u64 non_page_bits = (1ULL << page_shift) - 1;
>>> +
>>> + return (op->va.addr & non_page_bits) == 0 &&
>>> + (op->va.range & non_page_bits) == 0 &&
>>> + (op->gem.offset & non_page_bits) == 0;
>>> +}
>>> +
>>> +static u8
>>> +select_page_shift(struct nouveau_uvmm *uvmm, struct drm_gpuva_op_map *op)
>>> +{
>>> + struct nouveau_bo *nvbo = nouveau_gem_object(op->gem.obj);
>>> +
>>> + /* nouveau_bo_fixup_align() guarantees that the page size will be aligned
>>> + * for most cases, but it can't handle cases where userspace allocates with
>>> + * a size and then binds with a smaller granularity. So in order to avoid
>>> + * breaking old userspace, we need to ensure that the VA is actually
>>> + * aligned before using it, and if it isn't, then we downgrade to the first
>>> + * granularity that will fit, which is optimal from a correctness and
>>> + * performance perspective.
>>> + */
>>> + if (op_map_aligned_to_page_shift(op, nvbo->page))
>>> + return nvbo->page;
>>> +
>>> + struct nouveau_mem *mem = nouveau_mem(nvbo->bo.resource);
>>> + struct nvif_vmm *vmm = &uvmm->vmm.vmm;
>>> + int i;
>>> +
>>> + /* If the given granularity doesn't fit, let's find one that will fit. */
>>> + for (i = 0; i < vmm->page_nr; i++) {
>>> + /* Ignore anything that is bigger or identical to the BO preference. */
>>> + if (vmm->page[i].shift >= nvbo->page)
>>> + continue;
>>> +
>>> + /* Skip incompatible domains. */
>>> + if ((mem->mem.type & NVIF_MEM_VRAM) && !vmm->page[i].vram)
>>> + continue;
>>> + if ((mem->mem.type & NVIF_MEM_HOST) &&
>>> + (!vmm->page[i].host || vmm->page[i].shift > PAGE_SHIFT))
>>> + continue;
>>
>> This logic doesn't seem correct. I'm not sure why there's a need to
>> limit the page size for the host memory type, but assuming there is one,
>> due to a nouveau architecture or HW limitation I'm not aware of, it
>> should be applied universally, not just when falling back due to
>> misaligned addresses. You can get lucky and have aligned addresses
>> regardless of the target page size. Hence, this check would need to
>> precede the above early-out for the case where
>> op_map_aligned_to_page_shift() succeeds.
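>>
>> A rough, untested sketch of the ordering I mean, reusing this patch's
>> helpers (and assuming vmm->page[] is sorted from largest to smallest
>> shift, which the fallback loop below already relies on):
>>
>>     static u8
>>     select_page_shift(struct nouveau_uvmm *uvmm, struct drm_gpuva_op_map *op)
>>     {
>>             struct nouveau_bo *nvbo = nouveau_gem_object(op->gem.obj);
>>             struct nouveau_mem *mem = nouveau_mem(nvbo->bo.resource);
>>             struct nvif_vmm *vmm = &uvmm->vmm.vmm;
>>             int i;
>>
>>             for (i = 0; i < vmm->page_nr; i++) {
>>                     /* Never exceed the BO's preferred page size. */
>>                     if (vmm->page[i].shift > nvbo->page)
>>                             continue;
>>
>>                     /* Apply the domain restrictions to every candidate,
>>                      * including the preferred size itself.
>>                      */
>>                     if ((mem->mem.type & NVIF_MEM_VRAM) && !vmm->page[i].vram)
>>                             continue;
>>                     if ((mem->mem.type & NVIF_MEM_HOST) &&
>>                         (!vmm->page[i].host || vmm->page[i].shift > PAGE_SHIFT))
>>                             continue;
>>
>>                     /* Take the largest shift that is also aligned. */
>>                     if (op_map_aligned_to_page_shift(op, vmm->page[i].shift))
>>                             return vmm->page[i].shift;
>>             }
>>
>>             WARN_ON(1);
>>             return PAGE_SHIFT;
>>     }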
>>
>> Thanks,
>> -James
>>
>>> + /* If it fits, return the proposed shift. */
>>> + if (op_map_aligned_to_page_shift(op, vmm->page[i].shift))
>>> + return vmm->page[i].shift;
>>> + }
>>> +
>>> + /* If we get here then nothing can reconcile the requirements. This should never
>>> + * happen.
>>> + */
>>> + WARN_ON(1);
>>> +
>>> + return PAGE_SHIFT;
>>> +}
>>> +
>>> static void
>>> nouveau_uvmm_sm_prepare_unwind(struct nouveau_uvmm *uvmm,
>>> struct nouveau_uvma_prealloc *new,
>>> @@ -506,7 +562,7 @@ nouveau_uvmm_sm_prepare_unwind(struct nouveau_uvmm *uvmm,
>>> if (vmm_get_range)
>>> nouveau_uvmm_vmm_put(uvmm, vmm_get_start,
>>> vmm_get_range,
>>> - PAGE_SHIFT);
>>> + select_page_shift(uvmm, &op->map));
>>> break;
>>> }
>>> case DRM_GPUVA_OP_REMAP: {
>>> @@ -599,7 +655,7 @@ op_map_prepare(struct nouveau_uvmm *uvmm,
>>>
>>> uvma->region = args->region;
>>> uvma->kind = args->kind;
>>> - uvma->page_shift = PAGE_SHIFT;
>>> + uvma->page_shift = select_page_shift(uvmm, op);
>>>
>>> drm_gpuva_map(&uvmm->base, &uvma->va, op);
>>>
>>