Message-ID: <498e0731-81a4-4f75-95b4-a8ad0bcc7665@huawei.com>
Date: Mon, 19 Aug 2024 21:14:26 +0800
From: Kefeng Wang <wangkefeng.wang@...wei.com>
To: Peter Xu <peterx@...hat.com>
CC: Jason Gunthorpe <jgg@...dia.com>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, Sean Christopherson <seanjc@...gle.com>,
Oscar Salvador <osalvador@...e.de>, Axel Rasmussen
<axelrasmussen@...gle.com>, <linux-arm-kernel@...ts.infradead.org>,
<x86@...nel.org>, Will Deacon <will@...nel.org>, Gavin Shan
<gshan@...hat.com>, Paolo Bonzini <pbonzini@...hat.com>, Zi Yan
<ziy@...dia.com>, Andrew Morton <akpm@...ux-foundation.org>, Catalin Marinas
<catalin.marinas@....com>, Ingo Molnar <mingo@...hat.com>, Alistair Popple
<apopple@...dia.com>, Borislav Petkov <bp@...en8.de>, David Hildenbrand
<david@...hat.com>, Thomas Gleixner <tglx@...utronix.de>,
<kvm@...r.kernel.org>, Dave Hansen <dave.hansen@...ux.intel.com>, Alex
Williamson <alex.williamson@...hat.com>, Yan Zhao <yan.y.zhao@...el.com>
Subject: Re: [PATCH 00/19] mm: Support huge pfnmaps
On 2024/8/16 22:33, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 11:05:33AM +0800, Kefeng Wang wrote:
>>
>>
>> On 2024/8/16 3:20, Peter Xu wrote:
>>> On Wed, Aug 14, 2024 at 09:37:15AM -0300, Jason Gunthorpe wrote:
>>>>> Currently, only x86_64 (1G+2M) and arm64 (2M) are supported.
>>>>
>>>> There is definitely interest here in extending ARM to support the 1G
>>>> size too, what is missing?
>>>
>>> Currently PUD pfnmap relies on THP_PUD config option:
>>>
>>>   config ARCH_SUPPORTS_PUD_PFNMAP
>>>   	def_bool y
>>>   	depends on ARCH_SUPPORTS_HUGE_PFNMAP && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>>>
>>> Arm64 unfortunately doesn't yet support dax 1G, so not applicable yet.
>>>
>>> Ideally, pfnmap is too simple comparing to real THPs and it shouldn't
>>> require to depend on THP at all, but we'll need things like below to land
>>> first:
>>>
>>> https://lore.kernel.org/r/20240717220219.3743374-1-peterx@redhat.com
>>>
>>> I sent that first a while ago, but I didn't collect enough inputs, and I
>>> decided to unblock this series from that, so x86_64 shouldn't be affected,
>>> and arm64 will at least start to have 2M.
>>>
>>>>
>>>>> The other trick is how to allow gup-fast working for such huge mappings
>>>>> even if there's no direct sign of knowing whether it's a normal page or
>>>>> MMIO mapping. This series chose to keep the pte_special solution, so that
>>>>> it reuses similar idea on setting a special bit to pfnmap PMDs/PUDs so that
>>>>> gup-fast will be able to identify them and fail properly.
>>>>
>>>> Make sense
>>>>
>>>>> More architectures / More page sizes
>>>>> ------------------------------------
>>>>>
>>>>> Currently only x86_64 (2M+1G) and arm64 (2M) are supported.
>>>>>
>>>>> For example, if arm64 can start to support THP_PUD one day, the huge pfnmap
>>>>> on 1G will be automatically enabled.
>>
>> A draft patch to enable THP_PUD on arm64, only passed with DEBUG_VM_PGTABLE,
>> we may test pud pfnmaps on arm64.
>
> Thanks, Kefeng. It'll be great if this works already, as simple.
>
> Might be interesting to know whether it works already if you have some
> few-GBs GPU around on the systems.
>
> Logically as long as you have HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD selected
> below, 1g pfnmap will be automatically enabled when you rebuild the kernel.
> You can double check that by looking for this:
>
> CONFIG_ARCH_SUPPORTS_PUD_PFNMAP=y
>
> And you can try to observe the mappings by enabling dynamic debug for
> vfio_pci_mmap_huge_fault(), then map the bar with vfio-pci and read
> something from it.

I don't have such a device, but we wrote a test driver which uses
vmf_insert_pfn_pmd/pud in its huge_fault handler:

  static const struct vm_operations_struct test_vm_ops = {
  	.huge_fault = test_huge_fault,
  	...
  };

Reading and writing the mapping after mmap(,2M/1G,test_fd,...) works as
expected. Since the patch could also be used by dax, let's send it
separately.
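
For anyone who wants to reason about when such a handler can install a
PMD/PUD entry versus falling back to a smaller size, below is a rough
userspace model of the range and alignment check a huge_fault handler
would usually make before calling vmf_insert_pfn_pmd/pud. This is not the
test driver's code. The model_* names, the 4K base page size and the
order values (9 for 2M, 18 for 1G) are assumptions made for illustration
only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this model only: 4K base pages, PMD = 2M, PUD = 1G. */
#define MODEL_PAGE_SHIFT 12u
#define MODEL_PMD_ORDER   9u
#define MODEL_PUD_ORDER  18u

/* Hypothetical stand-in for the parts of a VMA the check needs. */
struct model_vma {
	uint64_t vm_start;  /* first mapped virtual address */
	uint64_t vm_end;    /* one past the last mapped address */
	uint64_t base_pfn;  /* pfn backing vm_start */
};

/*
 * Return true if a mapping of the given order could be installed for
 * fault_addr, false if the handler should fall back to a smaller order.
 */
static bool model_can_map_huge(const struct model_vma *vma,
			       uint64_t fault_addr, unsigned int order)
{
	assert(order == 0 || order == MODEL_PMD_ORDER ||
	       order == MODEL_PUD_ORDER);

	uint64_t size = (uint64_t)1 << (MODEL_PAGE_SHIFT + order);
	/* Round the fault address down to the start of the huge range. */
	uint64_t addr = fault_addr & ~(size - 1);

	/* The whole huge range must sit inside the VMA. */
	if (addr < vma->vm_start || addr + size > vma->vm_end)
		return false;

	/* The backing pfn must be aligned to the mapping size as well. */
	uint64_t pfn = vma->base_pfn +
		       ((addr - vma->vm_start) >> MODEL_PAGE_SHIFT);
	return (pfn & (((uint64_t)1 << order) - 1)) == 0;
}
```

With a 1G-aligned VMA backed by a 1G-aligned pfn, the model accepts both
orders. A 2M VMA or a misaligned pfn makes it reject the larger sizes,
which is where a real handler would be expected to fall back.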