[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230414142341.354556-10-shiyn.lin@gmail.com>
Date: Fri, 14 Apr 2023 22:23:33 +0800
From: Chih-En Lin <shiyn.lin@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Qi Zheng <zhengqi.arch@...edance.com>,
David Hildenbrand <david@...hat.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Christophe Leroy <christophe.leroy@...roup.eu>,
John Hubbard <jhubbard@...dia.com>,
Nadav Amit <namit@...are.com>, Barry Song <baohua@...nel.org>,
Pasha Tatashin <pasha.tatashin@...een.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>,
Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Yu Zhao <yuzhao@...gle.com>,
Steven Barrett <steven@...uorix.net>,
Juergen Gross <jgross@...e.com>, Peter Xu <peterx@...hat.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Tong Tiangen <tongtiangen@...wei.com>,
Christoph Hellwig <hch@...radead.org>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Yang Shi <shy828301@...il.com>,
Vlastimil Babka <vbabka@...e.cz>,
Alex Sierra <alex.sierra@....com>,
Vincent Whitchurch <vincent.whitchurch@...s.com>,
Anshuman Khandual <anshuman.khandual@....com>,
Li kunyu <kunyu@...china.com>,
Liu Shixin <liushixin2@...wei.com>,
Hugh Dickins <hughd@...gle.com>,
Minchan Kim <minchan@...nel.org>,
Joey Gouly <joey.gouly@....com>,
Chih-En Lin <shiyn.lin@...il.com>,
Michal Hocko <mhocko@...e.com>,
Suren Baghdasaryan <surenb@...gle.com>,
"Zach O'Keefe" <zokeefe@...gle.com>,
Gautam Menghani <gautammenghani201@...il.com>,
Catalin Marinas <catalin.marinas@....com>,
Mark Brown <broonie@...nel.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Andrei Vagin <avagin@...il.com>,
Shakeel Butt <shakeelb@...gle.com>,
Daniel Bristot de Oliveira <bristot@...nel.org>,
"Jason A. Donenfeld" <Jason@...c4.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Alexey Gladkov <legion@...nel.org>, x86@...nel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-trace-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org,
Dinglan Peng <peng301@...due.edu>,
Pedro Fonseca <pfonseca@...due.edu>,
Jim Huang <jserv@...s.ncku.edu.tw>,
Huichun Feng <foxhoundsk.tw@...il.com>
Subject: [PATCH v5 09/17] mm/madvise: Handle COW-ed PTE with madvise()
Break COW PTE if madvise() modify the pte entry of COW-ed PTE.
Following are the list of flags which need to break COW PTE. However,
like MADV_HUGEPAGE and MADV_MERGEABLE, we should handle it respectively.
- MADV_DONTNEED: It calls to zap_page_range() which already be handled.
- MADV_FREE: It uses walk_page_range() with madvise_free_pte_range() to
free the page by itself, so add break_cow_pte().
- MADV_REMOVE: Same as MADV_FREE, it remove the page by itself, so add
break_cow_pte_range().
- MADV_COLD: Similar to MAD_FREE, break COW PTE before pageout.
- MADV_POPULATE: Let GUP deal with it.
Signed-off-by: Chih-En Lin <shiyn.lin@...il.com>
---
mm/madvise.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/mm/madvise.c b/mm/madvise.c
index 340125d08c03..71176edb751e 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -425,6 +425,9 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
if (pmd_trans_unstable(pmd))
return 0;
#endif
+ if (break_cow_pte(vma, pmd, addr))
+ return 0;
+
tlb_change_page_size(tlb, PAGE_SIZE);
orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
flush_tlb_batched_pending(mm);
@@ -625,6 +628,10 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
if (pmd_trans_unstable(pmd))
return 0;
+ /* We should only allocate PTE. */
+ if (break_cow_pte(vma, pmd, addr))
+ goto next;
+
tlb_change_page_size(tlb, PAGE_SIZE);
orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
flush_tlb_batched_pending(mm);
@@ -984,6 +991,12 @@ static long madvise_remove(struct vm_area_struct *vma,
if ((vma->vm_flags & (VM_SHARED|VM_WRITE)) != (VM_SHARED|VM_WRITE))
return -EACCES;
+ error = break_cow_pte_range(vma, start, end);
+ if (error < 0)
+ return error;
+ else if (error > 0)
+ return -ENOMEM;
+
offset = (loff_t)(start - vma->vm_start)
+ ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
--
2.34.1
Powered by blists - more mailing lists