[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a552416e-fd32-4b84-b5d6-40a27530c939@redhat.com>
Date: Fri, 11 Oct 2024 13:33:12 +0200
From: David Hildenbrand <david@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>, linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, kvm@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, Hugh Dickins <hughd@...gle.com>,
Thomas Huth <thuth@...hat.com>, "Matthew Wilcox (Oracle)"
<willy@...radead.org>, Christian Borntraeger <borntraeger@...ux.ibm.com>,
Janosch Frank <frankja@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>, Leo Fu <bfu@...hat.com>
Subject: Re: [PATCH v1 2/2] mm: don't install PMD mappings when THPs are
disabled by the hw/process/vma
On 11.10.24 13:29, Ryan Roberts wrote:
> On 11/10/2024 11:24, David Hildenbrand wrote:
>> We (or rather, readahead logic :) ) might be allocating a THP in the
>> pagecache and then try mapping it into a process that explicitly disabled
>> THP: we might end up installing PMD mappings.
>>
>> This is a problem for s390x KVM, which explicitly remaps all PMD-mapped
>> THPs to be PTE-mapped in s390_enable_sie()->thp_split_mm(), before
>> starting the VM.
>>
>> For example, starting a VM backed on a file system with large folios
>> supported makes the VM crash when the VM tries accessing such a mapping
>> using KVM.
>>
>> Is it also a problem when the HW disabled THP using
>> TRANSPARENT_HUGEPAGE_UNSUPPORTED? At least on x86 this would be the case
>> without X86_FEATURE_PSE.
>>
>> In the future, we might be able to do better on s390x and only disallow
>> PMD mappings -- what s390x and likely TRANSPARENT_HUGEPAGE_UNSUPPORTED
>> really wants. For now, fix it by essentially performing the same check as
>> would be done in __thp_vma_allowable_orders() or in shmem code, where this
>> works as expected, and disallow PMD mappings, making us fallback to PTE
>> mappings.
>>
>> Reported-by: Leo Fu <bfu@...hat.com>
>> Fixes: 793917d997df ("mm/readahead: Add large folio readahead")
>
> Will this patch be difficult to backport given it depends on the previous patch
> and that doesn't have a Fixes tag?
"difficult" -- not really. Andrew might want to tag patch #1 with
"Fixes:" as well, but I can also send simple stable backports that avoid
patch #1.
(Thinking again, I assume we want to Cc:stable)
>
>> Cc: Thomas Huth <thuth@...hat.com>
>> Cc: Matthew Wilcox (Oracle) <willy@...radead.org>
>> Cc: Ryan Roberts <ryan.roberts@....com>
>> Cc: Christian Borntraeger <borntraeger@...ux.ibm.com>
>> Cc: Janosch Frank <frankja@...ux.ibm.com>
>> Cc: Claudio Imbrenda <imbrenda@...ux.ibm.com>
>> Signed-off-by: David Hildenbrand <david@...hat.com>
>> ---
>> mm/memory.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 2366578015ad..a2e501489517 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -4925,6 +4925,15 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
>> pmd_t entry;
>> vm_fault_t ret = VM_FAULT_FALLBACK;
>>
>> + /*
>> + * It is too late to allocate a small folio, we already have a large
>> + * folio in the pagecache: especially s390 KVM cannot tolerate any
>> + * PMD mappings, but PTE-mapped THP are fine. So let's simply refuse any
>> + * PMD mappings if THPs are disabled.
>> + */
>> + if (thp_disabled_by_hw() || vma_thp_disabled(vma, vma->vm_flags))
>> + return ret;
>
> Why not just call thp_vma_allowable_orders()?
Why call thp_vma_allowable_orders() that does a lot more work that
doesn't really apply here? :)
I'd say, just like shmem, we handle this separately here.
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists