Message-ID: <1925d301-462d-6b33-8867-4e1646b2dbd6@redhat.com>
Date: Tue, 23 May 2023 11:41:24 +0200
From: David Hildenbrand <david@...hat.com>
To: Yang Yang <yang.yang29@....com.cn>, akpm@...ux-foundation.org
Cc: imbrenda@...ux.ibm.com, jiang.xuexin@....com.cn,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
ran.xiaokai@....com.cn, xu.xin.sc@...il.com, xu.xin16@....com.cn
Subject: Re: [PATCH v8 1/6] ksm: support unsharing KSM-placed zero pages

On 22.05.23 12:49, Yang Yang wrote:
> From: xu xin <xu.xin16@....com.cn>
>
> When use_zero_pages of ksm is enabled, madvise(addr, len, MADV_UNMERGEABLE)
> and other ways of triggering unsharing (like writing 2 to
> /sys/kernel/mm/ksm/run) will *not* actually unshare the shared zeropage
> placed by KSM (which contradicts the MADV_UNMERGEABLE documentation). As
> these KSM-placed zero pages are out of the control of KSM, the related
> counts of ksm pages don't expose how many zero pages were placed by KSM.
> (These special zero pages differ from initially mapped zero pages, because
> zero pages mapped into MADV_UNMERGEABLE areas are expected to be complete,
> unshared pages.)
>
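
For readers following along, the userspace sequence described above might look roughly like the sketch below. It is only an illustration: the helper name ksm_zero_roundtrip() is made up, the fallback values for MADV_MERGEABLE/MADV_UNMERGEABLE assume the generic Linux ABI, and it cannot observe whether ksmd actually merged anything (that depends on CONFIG_KSM, use_zero_pages, and scan timing). It only checks that the region still reads as zeroes after the round trip.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Fallbacks for older headers; values assume the generic Linux ABI. */
#ifndef MADV_MERGEABLE
#define MADV_MERGEABLE 12
#endif
#ifndef MADV_UNMERGEABLE
#define MADV_UNMERGEABLE 13
#endif

/*
 * Hypothetical helper: populate 'len' bytes of zero-filled anonymous
 * memory, mark it mergeable, then request unmerging again.
 *
 * Returns 0 if the region still reads as zeroes afterwards, -1 if the
 * mapping failed or the contents changed. The raw madvise() return
 * values are stored in *merge_rc and *unmerge_rc; a failure there
 * (e.g. on a kernel without CONFIG_KSM) does not fail the helper.
 */
static inline int ksm_zero_roundtrip(size_t len, int *merge_rc, int *unmerge_rc)
{
	unsigned char *buf;
	size_t i;
	int ret = 0;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Write zeroes so every page is backed by a real, empty page. */
	memset(buf, 0, len);

	*merge_rc = madvise(buf, len, MADV_MERGEABLE);
	*unmerge_rc = madvise(buf, len, MADV_UNMERGEABLE);

	/* Whatever KSM did in between, the data must still be all zero. */
	for (i = 0; i < len; i++) {
		if (buf[i] != 0) {
			ret = -1;
			break;
		}
	}

	munmap(buf, len);
	return ret;
}
```

With the behaviour described in the commit message, a KSM-placed zeropage would survive the second madvise() call as a shared mapping; with this patch it should be replaced by a private page again.
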
> To avoid blindly unsharing all shared zero pages in applicable VMAs, this
> patch uses pte_mkdirty() (which is architecture-specific) to mark
> KSM-placed zero pages. Thus, MADV_UNMERGEABLE will only unshare those
> KSM-placed zero pages.
>
> The patch will not degrade the performance of use_zero_pages, as it doesn't
> change the way empty pages are merged when use_zero_pages is enabled.
>

Maybe add: "We'll reuse this mechanism to reliably identify KSM-placed
zeropages to properly account for them (e.g., calculating the KSM profit
that includes zeropages) next."

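
To illustrate what "profit that includes zeropages" could mean, here is a rough userspace sketch. It assumes the usual estimate (pages saved times page size, minus the rmap_item bookkeeping cost) and simply adds a KSM-placed zeropage count to the saved pages. The function name, the ksm_zero_pages parameter, and passing the rmap_item size as an argument are all assumptions for illustration, not what this patch implements.

```c
#include <stdint.h>

/*
 * Hypothetical estimate of KSM profit in bytes.
 *
 * pages_sharing:  pages deduplicated against a KSM page
 * ksm_zero_pages: pages replaced by a KSM-placed zeropage (assumed counter)
 * rmap_items:     number of rmap_item structures KSM allocated
 * page_size:      size of one page in bytes
 * rmap_item_size: size of one rmap_item in bytes (passed in, since the
 *                 kernel structure is not visible from userspace)
 *
 * The result may be negative when the metadata cost exceeds the savings.
 */
static inline int64_t toy_ksm_general_profit(uint64_t pages_sharing,
					     uint64_t ksm_zero_pages,
					     uint64_t rmap_items,
					     uint64_t page_size,
					     uint64_t rmap_item_size)
{
	uint64_t saved = (pages_sharing + ksm_zero_pages) * page_size;
	uint64_t cost = rmap_items * rmap_item_size;

	return (int64_t)saved - (int64_t)cost;
}
```

Without a reliable way to tell KSM-placed zeropages apart from ordinary ones, the ksm_zero_pages term cannot be computed, which is the reason for mentioning the accounting use case in the description.
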
> Signed-off-by: xu xin <xu.xin16@....com.cn>
> Suggested-by: David Hildenbrand <david@...hat.com>
> Cc: Claudio Imbrenda <imbrenda@...ux.ibm.com>
> Cc: Xuexin Jiang <jiang.xuexin@....com.cn>
> Reviewed-by: Xiaokai Ran <ran.xiaokai@....com.cn>
> Reviewed-by: Yang Yang <yang.yang29@....com.cn>
> ---
> include/linux/ksm.h | 6 ++++++
> mm/ksm.c | 5 +++--
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
> index 899a314bc487..7989200cdbb7 100644
> --- a/include/linux/ksm.h
> +++ b/include/linux/ksm.h
> @@ -26,6 +26,9 @@ int ksm_disable(struct mm_struct *mm);
>
> int __ksm_enter(struct mm_struct *mm);
> void __ksm_exit(struct mm_struct *mm);
> +/* use pte_mkdirty to track a KSM-placed zero page */
> +#define set_pte_ksm_zero(pte) pte_mkdirty(pte_mkspecial(pte))

If there is only a single user (which I assume), please inline it instead.

Let's add some more documentation:

/*
 * To identify zeropages that were mapped by KSM, we reuse the dirty bit
 * in the PTE. If the PTE is dirty, the zeropage was mapped by KSM when
 * deduplicating memory.
 */

> +#define is_ksm_zero_pte(pte) (is_zero_pfn(pte_pfn(pte)) && pte_dirty(pte))
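
To spell out the intended semantics, here is a small userspace model of the check. Everything in it is hypothetical (the toy_ names, the struct layout, the flag bits, and TOY_ZERO_PFN); real PTE encodings are architecture-specific. It only shows that a zeropage mapping counts as KSM-placed when, and only when, the dirty bit is also set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits and zeropage PFN for the toy model. */
#define TOY_PTE_SPECIAL	(1u << 0)
#define TOY_PTE_DIRTY	(1u << 1)
#define TOY_ZERO_PFN	0x1234u

/* Toy stand-in for pte_t: a PFN plus a few flag bits. */
struct toy_pte {
	uint64_t pfn;
	unsigned int flags;
};

static inline struct toy_pte toy_pfn_pte(uint64_t pfn)
{
	struct toy_pte pte = { .pfn = pfn, .flags = 0 };

	return pte;
}

static inline struct toy_pte toy_pte_mkspecial(struct toy_pte pte)
{
	pte.flags |= TOY_PTE_SPECIAL;
	return pte;
}

static inline struct toy_pte toy_pte_mkdirty(struct toy_pte pte)
{
	pte.flags |= TOY_PTE_DIRTY;
	return pte;
}

static inline bool toy_pte_dirty(struct toy_pte pte)
{
	return pte.flags & TOY_PTE_DIRTY;
}

static inline bool toy_is_zero_pfn(uint64_t pfn)
{
	return pfn == TOY_ZERO_PFN;
}

/* Models is_ksm_zero_pte(): zeropage PFN *and* dirty bit set. */
static inline bool toy_is_ksm_zero_pte(struct toy_pte pte)
{
	return toy_is_zero_pfn(pte.pfn) && toy_pte_dirty(pte);
}
```

In this model, a zeropage mapped by a read fault would be toy_pte_mkspecial(toy_pfn_pte(TOY_ZERO_PFN)) and is not reported as KSM-placed; applying toy_pte_mkdirty() on top, as replace_page() does in the hunk below, makes the check return true.
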
>
> static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
> {
> @@ -95,6 +98,9 @@ static inline void ksm_exit(struct mm_struct *mm)
> {
> }
>
> +#define set_pte_ksm_zero(pte) pte_mkspecial(pte)
> +#define is_ksm_zero_pte(pte) 0
> +
> #ifdef CONFIG_MEMORY_FAILURE
> static inline void collect_procs_ksm(struct page *page,
> struct list_head *to_kill, int force_early)
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 0156bded3a66..9962f5962afd 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -447,7 +447,8 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
> if (is_migration_entry(entry))
> page = pfn_swap_entry_to_page(entry);
> }
> - ret = page && PageKsm(page);
> + /* return 1 if the page is a normal KSM page or a KSM-placed zero page */
> + ret = (page && PageKsm(page)) || is_ksm_zero_pte(*pte);
> pte_unmap_unlock(pte, ptl);
> return ret;
> }
> @@ -1220,7 +1221,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
> page_add_anon_rmap(kpage, vma, addr, RMAP_NONE);
> newpte = mk_pte(kpage, vma->vm_page_prot);
> } else {
> - newpte = pte_mkspecial(pfn_pte(page_to_pfn(kpage),
> + newpte = set_pte_ksm_zero(pfn_pte(page_to_pfn(kpage),
> vma->vm_page_prot));
> /*
> * We're replacing an anonymous page with a zero page, which is

Apart from that LGTM.

--
Thanks,

David / dhildenb