linux-kernel - Re: [RFC PATCH 1/2] mm,drm/ttm: Block fast GUP to TTM huge pages

Message-ID: <0b984f96-00fb-5410-bb16-02e12b2cc024@shipmail.org>
Date:   Wed, 24 Mar 2021 16:50:14 +0100
From:   Thomas Hellström (Intel) 
        <thomas_os@...pmail.org>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     David Airlie <airlied@...ux.ie>, linux-kernel@...r.kernel.org,
        dri-devel@...ts.freedesktop.org, linux-mm@...ck.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Christian Koenig <christian.koenig@....com>
Subject: Re: [RFC PATCH 1/2] mm,drm/ttm: Block fast GUP to TTM huge pages


On 3/24/21 2:48 PM, Jason Gunthorpe wrote:
> On Wed, Mar 24, 2021 at 02:35:38PM +0100, Thomas Hellström (Intel) wrote:
>
>>> In an ideal world the creation/destruction of page table levels would
>>> be dynamic at this point, like THP.
>> Hmm, but I'm not sure what problem we're trying to solve by changing the
>> interface in this way?
> We are trying to make a sensible driver API to deal with huge pages.
>   
>> Currently if the core vm requests a huge pud, we give it one, and if we
>> can't or don't want to (because of dirty-tracking, for example, which is
>> always done on 4K page-level) we just return VM_FAULT_FALLBACK, and the
>> fault is retried at a lower level.
> Well, my thought would be to move the pte related stuff into
> vmf_insert_range instead of recursing back via VM_FAULT_FALLBACK.
>
> I don't know if the locking works out, but it feels cleaner that the
> driver tells the vmf how big a page it can stuff in, not the vm
> telling the driver to stuff in a certain size page which it might not
> want to do.
>
> Some devices want to work on an in-between page size like 64k so they
> can't form 2M pages but they can stuff 64k of 4K pages in a batch on
> every fault.
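
For illustration, the 64k batching idea quoted above can be modelled in plain userspace C. This is only a sketch under stated assumptions: `batch_window`, `struct model_range` and the `MODEL_*` constants are hypothetical names, not kernel APIs, and the VMA bounds are assumed to be 4K-aligned with the fault address inside them.

```c
#include <stdint.h>

/* Hypothetical model constants; not kernel definitions. */
#define MODEL_PAGE_SIZE  4096ULL
#define MODEL_BATCH_SIZE (64ULL * 1024)

/* Range of 4K pages the driver would populate for one fault. */
struct model_range {
	uint64_t start;   /* first virtual address to populate */
	uint64_t npages;  /* number of 4K pages in the batch */
};

/*
 * Pick the naturally aligned 64k window around the fault address and
 * clamp it to the VMA, so the batch never spills outside the mapping.
 */
static struct model_range batch_window(uint64_t vma_start, uint64_t vma_end,
				       uint64_t fault_addr)
{
	uint64_t start = fault_addr & ~(MODEL_BATCH_SIZE - 1);
	uint64_t end = start + MODEL_BATCH_SIZE;

	if (start < vma_start)
		start = vma_start;
	if (end > vma_end)
		end = vma_end;

	return (struct model_range){ start, (end - start) / MODEL_PAGE_SIZE };
}
```

In this model the driver, not the core vm, decides how many pages go in per fault; whether that can be made to work with the real locking is exactly the open question in the thread.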

Hmm, yes, but in that case we would still be limited to inserting ranges
no larger than the fault size, to avoid extensive and possibly
unnecessary checks for contiguous memory. And if we can't support the
full fault size, we'd need to either presume a size and alignment for
the next level or search for contiguous memory in both directions around
the fault address, perhaps unnecessarily as well. I do think the current
interface works OK, as we're just acting on what the core vm tells us to do.
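
To make the current behaviour concrete, here is a rough userspace model of the fallback path: the core asks for the largest level first, and the driver either serves it or returns a fallback code so the fault is retried one level down. It is a sketch only. All names (`model_bo`, `driver_huge_fault`, `core_handle_fault`, `FAULT_FALLBACK`) are made up for the example and merely mirror the role of VM_FAULT_FALLBACK, and the buffer is assumed to be one physically contiguous chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only; none of these names are kernel APIs. */
enum fault_result { FAULT_DONE, FAULT_FALLBACK };
enum fault_level { LEVEL_PUD, LEVEL_PMD, LEVEL_PTE };

static const uint64_t level_size[] = {
	[LEVEL_PUD] = 1ULL << 30,  /* 1G */
	[LEVEL_PMD] = 1ULL << 21,  /* 2M */
	[LEVEL_PTE] = 1ULL << 12,  /* 4K */
};

struct model_bo {
	uint64_t va_start;     /* start of the user mapping */
	uint64_t pa_start;     /* start of the contiguous backing memory */
	uint64_t size;         /* mapping size in bytes */
	bool dirty_tracking;   /* tracked at 4K granularity only */
};

/* Driver side: serve the requested level, or ask for a retry one level down. */
static enum fault_result driver_huge_fault(const struct model_bo *bo,
					   uint64_t addr, enum fault_level level)
{
	uint64_t sz = level_size[level];
	uint64_t win = addr & ~(sz - 1);  /* aligned window around the fault */

	assert(addr >= bo->va_start && addr - bo->va_start < bo->size);

	if (level == LEVEL_PTE)
		return FAULT_DONE;       /* 4K can always be served */
	if (bo->dirty_tracking)
		return FAULT_FALLBACK;   /* huge entries would hide 4K dirtying */
	if (win < bo->va_start || win - bo->va_start + sz > bo->size)
		return FAULT_FALLBACK;   /* window not fully inside the mapping */
	if ((bo->pa_start + (win - bo->va_start)) & (sz - 1))
		return FAULT_FALLBACK;   /* backing memory misaligned for this level */
	return FAULT_DONE;
}

/* Core side: try PUD, then PMD, and end at PTE. Returns the level used. */
static enum fault_level core_handle_fault(const struct model_bo *bo,
					  uint64_t addr)
{
	for (int level = LEVEL_PUD; level < LEVEL_PTE; level++) {
		if (driver_huge_fault(bo, addr, (enum fault_level)level) == FAULT_DONE)
			return (enum fault_level)level;
	}
	return LEVEL_PTE;
}
```

Note that the driver only ever checks the one aligned window it was asked about; it never has to search around the fault address, which is the property argued for above.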

/Thomas

>
> That idea doesn't fit naturally if the VM is driving the size.
>
> Jason