[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <F10F1618-7153-41C7-A475-522D833C41D4@linux.dev>
Date: Fri, 3 Feb 2023 10:40:18 +0800
From: Muchun Song <muchun.song@...ux.dev>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
Anshuman Khandual <anshuman.khandual@....com>,
linux-arm-kernel@...ts.infradead.org,
Mark Rutland <mark.rutland@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Mark Brown <broonie@...nel.org>
Subject: Re: [PATCH V2] arm64/mm: Intercept pfn changes in set_pte_at()
> On Feb 2, 2023, at 18:45, Catalin Marinas <catalin.marinas@....com> wrote:
>
> On Thu, Feb 02, 2023 at 05:51:39PM +0800, Muchun Song wrote:
>>> On Feb 1, 2023, at 20:20, Catalin Marinas <catalin.marinas@....com> wrote:
>>>> Bah, sorry! Catalin reckons it may have been him talking about the vmemmap.
>>>
>>> Indeed. The discussion with Anshuman started from this thread:
>>>
>>> https://lore.kernel.org/all/20221025014215.3466904-1-mawupeng1@huawei.com/
>>>
>>> We already trip over the existing checks even without Anshuman's patch,
>>> though only by chance. We are not setting the software PTE_DIRTY on the
>>> new pte (we don't bother with this bit for kernel mappings).
>>>
>>> Given that the vmemmap ptes are still live when such change happens and
>>> no-one came with a solution to the break-before-make problem, I propose
>>> we revert the arm64 part of commit 47010c040dec ("mm: hugetlb_vmemmap:
>>> cleanup CONFIG_HUGETLB_PAGE_FREE_VMEMMAP*"). We just need this hunk:
>>>
>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>> index 27b2592698b0..5263454a5794 100644
>>> --- a/arch/arm64/Kconfig
>>> +++ b/arch/arm64/Kconfig
>>> @@ -100,7 +100,6 @@ config ARM64
>>> select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
>>> select ARCH_WANT_FRAME_POINTERS
>>> select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
>>> - select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
>>
>> Maybe it is a little overkill for HVO as it can significantly minimize the
>> overhead of vmemmap on ARM64 servers for some workloads (like qemu, DPDK).
>> So I don't think disabling it is a good approach. Indeed, HVO broke BBM,
>> but the waring does not affect anything since the tail vmemmap pages are
>> supposed to be read-only. So, I suggest skipping warnings if it is the
>> vmemmap address in set_pte_at(). What do you think of?
>
> IIUC, vmemmap_remap_pte() not only makes the pte read-only but also
> changes the output address. Architecturally, this needs a BBM sequence.
> We can avoid going through an invalid pte if we first make the pte
> read-only, TLBI but keeping the same pfn, followed by a change of the
> pfn while keeping the pte readonly. This also assumes that the content
> of the page pointed at by the pte is the same at both old and new pfn.
Right. I think using BBM is to avoid possibly creating multiple TLB entries
for the same address for a extremely short period. But accessing either the
old page or the new page is fine in this case. Is it acceptable for this
special case without using BBM?
Thanks,
Muchun.
Powered by blists - more mailing lists