[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20220124140622.GA2955982@u2004>
Date: Mon, 24 Jan 2022 23:06:22 +0900
From: Naoya Horiguchi <naoya.horiguchi@...ux.dev>
To: Anshuman Khandual <anshuman.khandual@....com>
Cc: linux-mm@...ck.org, Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Zi Yan <ziy@...dia.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
John Hubbard <jhubbard@...dia.com>,
Matthew Wilcox <willy@...radead.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/migration: Add trace events for THP migrations
On Fri, Jan 21, 2022 at 12:08:14PM +0530, Anshuman Khandual wrote:
>
>
> On 1/11/22 12:27 PM, Naoya Horiguchi wrote:
> > On Tue, Jan 11, 2022 at 10:31:21AM +0530, Anshuman Khandual wrote:
> >>
> >>
> >> On 1/11/22 7:28 AM, Naoya Horiguchi wrote:
> >>> Hi Anshuman,
> >>>
> >>> On Fri, Jan 07, 2022 at 10:29:35AM +0530, Anshuman Khandual wrote:
> >>>> This adds two trace events for PMD based THP migration without split. These
> >>>> events closely follow the implementation details like setting and removing
> >>>> of PMD migration entries, which are essential operations for THP migration.
> >>>
> >>> I often want to check which individual pages are migrated to which places
> >>> (or not migrated) for testing, so these new tracepoints could help me.
> >>> Maybe these can be much greater if they can handle other types of page
> >>> migration for raw pages and hugetlb pages. Is it hard to cover all such
> >>> page migration events?
> >>
> >> Are you suggesting to cover all migration entry transitions for normal
> >> and HugeTLB pages as well ?
> >
> > Yes if you like the idea. I think that some events listed below can be grouped
> > into one tracepoint event with showing args like pgsize or read/write flags
> > (or implementation detail is up to you).
>
> In its simplest form, something like this will work ? Although the THP migration
> patch still remains (almost) unchanged.
>
> include/trace/events/migrate.h | 38 ++++++++++++++++++++++++++++++++++
> mm/migrate.c | 10 +++++++--
> mm/rmap.c | 3 +++
> 3 files changed, 49 insertions(+), 2 deletions(-)
>
> diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h
> index 779f3fad9ecd..b66652fcc8af 100644
> --- a/include/trace/events/migrate.h
> +++ b/include/trace/events/migrate.h
> @@ -105,6 +105,44 @@ TRACE_EVENT(mm_migrate_pages_start,
> __print_symbolic(__entry->reason, MIGRATE_REASON))
> );
>
> +TRACE_EVENT(set_migration_pte,
> +
> + TP_PROTO(unsigned long addr, unsigned long pte),
> +
> + TP_ARGS(addr, pte),
> +
> + TP_STRUCT__entry(
> + __field(unsigned long, addr)
> + __field(unsigned long, pte)
> + ),
> +
> + TP_fast_assign(
> + __entry->addr = addr;
> + __entry->pte = pte;
> + ),
> +
> + TP_printk("create pte migration entry addr=%lx, pte=%lx", __entry->addr, __entry->pte)
> +);
> +
> +TRACE_EVENT(remove_migration_ptes,
> +
> + TP_PROTO(struct page *old_page, struct page *new_page),
> +
> + TP_ARGS(old_page, new_page),
> +
> + TP_STRUCT__entry(
> + __field(struct page *, old_page)
> + __field(struct page *, new_page)
> + ),
> +
> + TP_fast_assign(
> + __entry->old_page = old_page;
> + __entry->new_page = new_page;
> + ),
> +
> + TP_printk("remove pte migration entry old_page=%lx new_page=%lx", __entry->old_page, __entry->new_page);
Thank you for the work, Anshuman. I have a few comments:
- the format string in TP_printk has prefix like "remove pte migration entry",
but it seems to me redundant because each trace event name is also printed
out like below:
bash-2611 [001] 272.952419: mm_migrate_pages_start: mode=MIGRATE_SYNC reason=mempolicy_mbind
bash-2611 [001] 272.952439: set_migration_pte: create pte migration entry addr=700000000000, pte=dfffffffdf478602
bash-2611 [001] 272.952466: remove_migration_ptes: remove pte migration entry old_pfn=0x105c3c new_pfn=0x15e8d8
Maybe trace events for THP migration can be more compact too.
- TP_ARGS for remove_migration_ptes() are inconsistent with those of
set_migration_pte and {set,remove}_migration_pmd, so how about putting
trace_remove_migration_pte() within remove_migration_ptes() instead of
after remove_migration_ptes().
- I think page order is also important to tell hugetlb migration events
from normal page migration events.
I borrowed your diff and made testing patch like below. Feel free to
use/update it.
Thanks,
Naoya Horiguchi
---
diff --git a/include/trace/events/migrate.h b/include/trace/events/migrate.h
index 779f3fad9ecd..f025e80bd5b9 100644
--- a/include/trace/events/migrate.h
+++ b/include/trace/events/migrate.h
@@ -105,6 +105,48 @@ TRACE_EVENT(mm_migrate_pages_start,
__print_symbolic(__entry->reason, MIGRATE_REASON))
);
+TRACE_EVENT(set_migration_pte,
+
+ TP_PROTO(unsigned long addr, unsigned long pte, int order),
+
+ TP_ARGS(addr, pte, order),
+
+ TP_STRUCT__entry(
+ __field(unsigned long, addr)
+ __field(unsigned long, pte)
+ __field(int, order)
+ ),
+
+ TP_fast_assign(
+ __entry->addr = addr;
+ __entry->pte = pte;
+ __entry->order = order;
+ ),
+
+ TP_printk("addr=%lx, pte=%lx, order=%d", __entry->addr, __entry->pte, __entry->order)
+);
+
+TRACE_EVENT(remove_migration_pte,
+
+ TP_PROTO(unsigned long addr, unsigned long pte, int order),
+
+ TP_ARGS(addr, pte, order),
+
+ TP_STRUCT__entry(
+ __field(unsigned long, addr)
+ __field(unsigned long, pte)
+ __field(int, order)
+ ),
+
+ TP_fast_assign(
+ __entry->addr = addr;
+ __entry->pte = pte;
+ __entry->order = order;
+ ),
+
+ TP_printk("addr=%lx, pte=%lx, order=%d", __entry->addr, __entry->pte, __entry->order)
+);
+
#endif /* _TRACE_MIGRATE_H */
/* This part must be outside protection */
diff --git a/mm/migrate.c b/mm/migrate.c
index c7da064b4781..0f9fb8813301 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -257,6 +257,8 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
if (PageTransHuge(page) && PageMlocked(page))
clear_page_mlock(page);
+ trace_remove_migration_pte(pvmw.address, pte_val(pte), compound_order(new));
+
/* No need to invalidate - it was non-present before */
update_mmu_cache(vma, pvmw.address, pvmw.pte);
}
diff --git a/mm/rmap.c b/mm/rmap.c
index b0fd9dc19eba..e089cbb9ef97 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -77,6 +77,7 @@
#include <asm/tlbflush.h>
#include <trace/events/tlb.h>
+#include <trace/events/migrate.h>
#include "internal.h"
@@ -1872,6 +1873,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
if (pte_swp_uffd_wp(pteval))
swp_pte = pte_swp_mkuffd_wp(swp_pte);
set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
+ trace_set_migration_pte(pvmw.address, pte_val(swp_pte), compound_order(page));
/*
* No need to invalidate here it will synchronize on
* against the special swap migration pte.
@@ -1940,6 +1942,7 @@ static bool try_to_migrate_one(struct page *page, struct vm_area_struct *vma,
if (pte_uffd_wp(pteval))
swp_pte = pte_swp_mkuffd_wp(swp_pte);
set_pte_at(mm, address, pvmw.pte, swp_pte);
+ trace_set_migration_pte(address, pte_val(swp_pte), compound_order(page));
/*
* No need to invalidate here it will synchronize on
* against the special swap migration pte.
--
2.25.1
Powered by blists - more mailing lists