linux-kernel - Re: [RFC PATCH 1/2] mm,drm/ttm: Block fast GUP to TTM huge pages

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210325133347.GY2356281@nvidia.com>
Date:   Thu, 25 Mar 2021 10:33:47 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     Christian König <christian.koenig@....com>
Cc:     Thomas Hellström (Intel) 
        <thomas_os@...pmail.org>, David Airlie <airlied@...ux.ie>,
        linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC PATCH 1/2] mm,drm/ttm: Block fast GUP to TTM huge pages

On Thu, Mar 25, 2021 at 02:26:50PM +0100, Christian König wrote:
> Am 25.03.21 um 14:17 schrieb Jason Gunthorpe:
> > On Thu, Mar 25, 2021 at 02:05:14PM +0100, Christian König wrote:
> > > 
> > > Am 25.03.21 um 13:42 schrieb Jason Gunthorpe:
> > > > On Thu, Mar 25, 2021 at 01:09:14PM +0100, Christian König wrote:
> > > > > Am 25.03.21 um 13:01 schrieb Jason Gunthorpe:
> > > > > > On Thu, Mar 25, 2021 at 12:53:15PM +0100, Thomas Hellström (Intel) wrote:
> > > > > > 
> > > > > > > Nope. The point here was that in this case, to make sure mmap uses the
> > > > > > > correct VA to give us a reasonable chance of alignement, the driver might
> > > > > > > need to be aware of and do trickery with the huge page-table-entry sizes
> > > > > > > anyway, although I think in most cases a standard helper for this can be
> > > > > > > supplied.
> > > > > > Of course the driver needs some way to influence the VA mmap uses,
> > > > > > gernally it should align to the natural page size of the device
> > > > > Well a mmap() needs to be aligned to the page size of the CPU, but not
> > > > > necessarily to the one of the device.
> > > > > 
> > > > > So I'm pretty sure the device driver should not be involved in any way the
> > > > > choosing of the VA for the CPU mapping.
> > > > No, if the device wants to use huge pages it must influence the mmap
> > > > VA or it can't form huge pgaes.
> > > No, that's the job of the core MM and not of the individual driver.
> > The core mm doesn't know the page size of the device, only which of
> > several page levels the arch supports. The device must be involevd
> > here.
> 
> Why? See you can have a device which has for example 256KiB pages, but it
> should perfectly work that the CPU mapping is aligned to only 4KiB.

The goal is to optimize large page size usage in the page tables.

There are three critera that impact this:
 1) The possible CPU page table sizes
 2) The useful contiguity the device can create in its iomemory
 3) The VA's alignment, as this sets an upper bound on 1 and 2

If a device has 256k pages and the arch supports 2M and 4k then the VA
should align to somewhere between 4k and 256k. The ideal alignment
would be to optimize PTE usage when stuffing 256k blocks by fully
populating PTEs and depends on the arch's # of PTE's per page.

If a device has 256k pages and the arch supports 256k pages then the
VA should align to 256k.

The device should never be touching any of this, it should simply
inform what its operating page size is and the MM should use that to
align the VA.

Jason