[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <50493410-c44c-7ef0-81f9-d4ce9a525c1f@arm.com>
Date: Thu, 23 Jan 2020 13:54:32 +0530
From: Anshuman Khandual <anshuman.khandual@....com>
To: Xuefeng Wang <wxf.wang@...ilicon.com>, catalin.marinas@....com,
will@...nel.org, mark.rutland@....com, arnd@...db.de,
akpm@...ux-foundation.org
Cc: linux-arch@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
chenzhou10@...wei.com
Subject: Re: [PATCH 0/2] mm/thp: rework the pmd protect changing flow
On 01/23/2020 01:25 PM, Xuefeng Wang wrote:
> On KunPeng920 board. When changing permission of a large range region,
> pmdp_invalidate() takes about 65% in profile (with hugepages) in JIT tool.
> Kernel will flush tlb twice: first flush happens in pmdp_invalidate, second
> flush happens at the end of change_protect_range(). The first pmdp_invalidate
> is not necessary if the hardware support atomic pmdp changing. The atomic
> changing pimd to zero can prevent the hardware from update asynchronous.
> So reconstruct it and remove the first pmdp_invalidate. And the second tlb
> flush can make sure the new tlb entry valid.
>
> This patch series add a pmdp_modify_prot transaction abstraction firstly.
> Then add pmdp_modify_prot_start() in arm64, which uses pmdp_huge_get_and_clear()
> to atomically fetch the pmd and zero the entry.
There is a comment section in change_huge_pmd() which details how clearing
the PMD entry there (in prot_numa case) can potentially race with another
concurrent madvise(MADV_DONTNEED, ..) call. Here is the comment block for
reference.
/*
* In case prot_numa, we are under down_read(mmap_sem). It's critical
* to not clear pmd intermittently to avoid race with MADV_DONTNEED
* which is also under down_read(mmap_sem):
*
* CPU0: CPU1:
* change_huge_pmd(prot_numa=1)
* pmdp_huge_get_and_clear_notify()
* madvise_dontneed()
* zap_pmd_range()
* pmd_trans_huge(*pmd) == 0 (without ptl)
* // skip the pmd
* set_pmd_at();
* // pmd is re-established
*
* The race makes MADV_DONTNEED miss the huge pmd and don't clear it
* which may break userspace.
*
* pmdp_invalidate() is required to make sure we don't miss
* dirty/young flags set by hardware.
*/
By defining the new override with pmdp_huge_get_and_clear(), are not we
now exposed to above race condition ?
- Anshuman
Powered by blists - more mailing lists