Date:   Mon, 22 Jun 2020 18:42:00 -0700
From:   Ralph Campbell <rcampbell@...dia.com>
To:     John Hubbard <jhubbard@...dia.com>,
        <nouveau@...ts.freedesktop.org>, <linux-kernel@...r.kernel.org>
CC:     Jerome Glisse <jglisse@...hat.com>, Christoph Hellwig <hch@....de>,
        "Jason Gunthorpe" <jgg@...lanox.com>,
        Ben Skeggs <bskeggs@...hat.com>
Subject: Re: [RESEND PATCH 2/3] nouveau: fix mixed normal and device private
 page migration


On 6/22/20 5:30 PM, John Hubbard wrote:
> On 2020-06-22 16:38, Ralph Campbell wrote:
>> The OpenCL function clEnqueueSVMMigrateMem(), without any flags, will
>> migrate memory in the given address range to device private memory. The
>> source pages might already have been migrated to device private memory.
>> In that case, the source struct page is not checked to see if it is
>> a device private page and incorrectly computes the GPU's physical
>> address of local memory leading to data corruption.
>> Fix this by checking the source struct page and computing the correct
>> physical address.
>>
>> Signed-off-by: Ralph Campbell <rcampbell@...dia.com>
>> ---
>>   drivers/gpu/drm/nouveau/nouveau_dmem.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>> index cc9993837508..f6a806ba3caa 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>> @@ -540,6 +540,12 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
>>       if (!(src & MIGRATE_PFN_MIGRATE))
>>           goto out;
>> +    if (spage && is_device_private_page(spage)) {
>> +        paddr = nouveau_dmem_page_addr(spage);
>> +        *dma_addr = DMA_MAPPING_ERROR;
>> +        goto done;
>> +    }
>> +
>>       dpage = nouveau_dmem_page_alloc_locked(drm);
>>       if (!dpage)
>>           goto out;
>> @@ -560,6 +566,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
>>               goto out_free_page;
>>       }
>> +done:
>>       *pfn = NVIF_VMM_PFNMAP_V0_V | NVIF_VMM_PFNMAP_V0_VRAM |
>>           ((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT);
>>       if (src & MIGRATE_PFN_WRITE)
>> @@ -615,6 +622,7 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
>>       struct migrate_vma args = {
>>           .vma        = vma,
>>           .start        = start,
>> +        .src_owner    = drm->dev,
> 
> Hi Ralph,
> 
> This .src_owner setting does look like a required fix, but it seems like
> a completely separate fix from what is listed in this patch's commit
> description, right? (It feels like a casualty of rearranging the patches.)
> 
> 
> thanks,

It's a bit more complex. There is a catch-22 here with the change to mm/migrate.c.
Without either this patch or the mm/migrate.c change, a second call to
clEnqueueSVMMigrateMem() for the same address range will invalidate the GPU
mapping to device private memory created by the first call.
With this patch but without the mm/migrate.c change, the first call to
clEnqueueSVMMigrateMem() will fail to migrate normal anonymous memory to device
private memory.
With the mm/migrate.c change but without this patch, a second call to
clEnqueueSVMMigrateMem() will crash the kernel because dma_map_page() will be
called with the device private PFN, which is not a valid CPU physical address.
With both changes, a range containing both anonymous and device private pages
can be migrated to the GPU and the GPU page tables are updated properly.
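
To make the device private case concrete, here is a rough userspace model of the
branch this patch adds to nouveau_dmem_migrate_copy_one(). It is only a sketch:
every model_* name and MODEL_* constant is a hypothetical stand-in rather than a
kernel API, the bump allocator merely plays the role of
nouveau_dmem_page_alloc_locked(), and the "DMA address" of a normal page is
simplified to its CPU address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical userspace stand-ins, not kernel APIs. */
#define MODEL_PAGE_SHIFT   12
#define MODEL_PAGE_SIZE    ((uint64_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_DMA_UNMAPPED UINT64_MAX /* plays the role of DMA_MAPPING_ERROR */

struct model_page {
	bool     device_private; /* already resident in GPU memory? */
	uint64_t addr;           /* VRAM address if device_private, else CPU address */
};

struct model_result {
	uint64_t vram_addr; /* address the GPU page table entry should point at */
	uint64_t dma_addr;  /* mapping of the source, or MODEL_DMA_UNMAPPED */
	bool     copied;    /* was a host-to-VRAM copy needed? */
};

/* Trivial bump allocator standing in for the driver's VRAM page allocator. */
static uint64_t model_next_vram = 0x100000;

static uint64_t model_alloc_vram(void)
{
	uint64_t addr = model_next_vram;

	model_next_vram += MODEL_PAGE_SIZE;
	return addr;
}

static struct model_result model_migrate_copy_one(const struct model_page *spage)
{
	struct model_result res;

	if (spage && spage->device_private) {
		/* Already in VRAM: reuse its address and never DMA-map it. */
		res.vram_addr = spage->addr;
		res.dma_addr = MODEL_DMA_UNMAPPED;
		res.copied = false;
	} else {
		/* Normal or absent source: allocate VRAM, copy only if present. */
		res.vram_addr = model_alloc_vram();
		res.dma_addr = spage ? spage->addr : MODEL_DMA_UNMAPPED;
		res.copied = spage != NULL;
	}

	/* The page table entry is built from a page-aligned VRAM address. */
	assert((res.vram_addr & (MODEL_PAGE_SIZE - 1)) == 0);
	return res;
}
```

In this model, a range that mixes a normal page with a page that is already
device private yields one copy into freshly allocated VRAM and one entry that
simply reuses the existing VRAM address, with no DMA mapping attempted for the
latter.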
