[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200604152336.4826-1-sjpark@amazon.com>
Date: Thu, 4 Jun 2020 17:23:36 +0200
From: SeongJae Park <sjpark@...zon.com>
To: David Hildenbrand <david@...hat.com>
CC: SeongJae Park <sjpark@...zon.com>, <akpm@...ux-foundation.org>,
"SeongJae Park" <sjpark@...zon.de>, <Jonathan.Cameron@...wei.com>,
<aarcange@...hat.com>, <acme@...nel.org>,
<alexander.shishkin@...ux.intel.com>, <amit@...nel.org>,
<benh@...nel.crashing.org>, <brendan.d.gregg@...il.com>,
<brendanhiggins@...gle.com>, <cai@....pw>,
<colin.king@...onical.com>, <corbet@....net>, <dwmw@...zon.com>,
<foersleo@...zon.de>, <irogers@...gle.com>, <jolsa@...hat.com>,
<kirill@...temov.name>, <mark.rutland@....com>, <mgorman@...e.de>,
<minchan@...nel.org>, <mingo@...hat.com>, <namhyung@...nel.org>,
<peterz@...radead.org>, <rdunlap@...radead.org>,
<riel@...riel.com>, <rientjes@...gle.com>, <rostedt@...dmis.org>,
<sblbir@...zon.com>, <shakeelb@...gle.com>, <shuah@...nel.org>,
<sj38.park@...il.com>, <snu@...zon.de>, <vbabka@...e.cz>,
<vdavydov.dev@...il.com>, <yang.shi@...ux.alibaba.com>,
<ying.huang@...el.com>, <linux-damon@...zon.com>,
<linux-mm@...ck.org>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: Re: [RFC v2 7/9] mm/damon: Implement callbacks for physical memory monitoring
On Thu, 4 Jun 2020 16:58:13 +0200 David Hildenbrand <david@...hat.com> wrote:
> On 04.06.20 09:26, SeongJae Park wrote:
> > On Wed, 3 Jun 2020 18:09:21 +0200 David Hildenbrand <david@...hat.com> wrote:
> >
> >> On 03.06.20 16:11, SeongJae Park wrote:
> >>> From: SeongJae Park <sjpark@...zon.de>
> >>>
> >>> This commit implements the four callbacks (->init_target_regions,
> >>> ->update_target_regions, ->prepare_access_check, and ->check_accesses)
> >>> for the basic access monitoring of the physical memory address space.
> >>> By setting the callback pointers to point those, users can easily
> >>> monitor the accesses to the physical memory.
> >>>
> >>> Internally, it uses the PTE Accessed bit, as similar to that of the
> >>> virtual memory support. Also, it supports only page frames that
> >>> supported by idle page tracking. Acutally, most of the code is stollen
> >>> from idle page tracking. Users who want to use other access check
> >>> primitives and monitor the frames that not supported with this
> >>> implementation could implement their own callbacks on their own.
> >>>
> >>> Signed-off-by: SeongJae Park <sjpark@...zon.de>
> >>> ---
> >>> include/linux/damon.h | 5 ++
> >>> mm/damon.c | 184 ++++++++++++++++++++++++++++++++++++++++++
> >>> 2 files changed, 189 insertions(+)
> >>>
> >>> diff --git a/include/linux/damon.h b/include/linux/damon.h
> >>> index 1a788bfd1b4e..f96503a532ea 100644
> >>> --- a/include/linux/damon.h
> >>> +++ b/include/linux/damon.h
> >>> @@ -216,6 +216,11 @@ void kdamond_update_vm_regions(struct damon_ctx *ctx);
> >>> void kdamond_prepare_vm_access_checks(struct damon_ctx *ctx);
> >>> unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx);
> >>>
> >>> +void kdamond_init_phys_regions(struct damon_ctx *ctx);
> >>> +void kdamond_update_phys_regions(struct damon_ctx *ctx);
> >>> +void kdamond_prepare_phys_access_checks(struct damon_ctx *ctx);
> >>> +unsigned int kdamond_check_phys_accesses(struct damon_ctx *ctx);
> >>> +
> >>> int damon_set_pids(struct damon_ctx *ctx, int *pids, ssize_t nr_pids);
> >>> int damon_set_attrs(struct damon_ctx *ctx, unsigned long sample_int,
> >>> unsigned long aggr_int, unsigned long regions_update_int,
> >>> diff --git a/mm/damon.c b/mm/damon.c
> >>> index f5cbc97a3bbc..6a5c6d540580 100644
> >>> --- a/mm/damon.c
> >>> +++ b/mm/damon.c
> >>> @@ -19,7 +19,9 @@
> >>> #include <linux/mm.h>
> >>> #include <linux/module.h>
> >>> #include <linux/page_idle.h>
> >>> +#include <linux/pagemap.h>
> >>> #include <linux/random.h>
> >>> +#include <linux/rmap.h>
> >>> #include <linux/sched/mm.h>
> >>> #include <linux/sched/task.h>
> >>> #include <linux/slab.h>
> >>> @@ -480,6 +482,11 @@ void kdamond_init_vm_regions(struct damon_ctx *ctx)
> >>> }
> >>> }
> >>>
> >>> +/* Do nothing. Users should set the initial regions by themselves */
> >>> +void kdamond_init_phys_regions(struct damon_ctx *ctx)
> >>> +{
> >>> +}
> >>> +
> >>> static void damon_mkold(struct mm_struct *mm, unsigned long addr)
> >>> {
> >>> pte_t *pte = NULL;
> >>> @@ -611,6 +618,178 @@ unsigned int kdamond_check_vm_accesses(struct damon_ctx *ctx)
> >>> return max_nr_accesses;
> >>> }
> >>>
> >>> +/* access check functions for physical address based regions */
> >>> +
> >>> +/* This code is stollen from page_idle.c */
> >>> +static struct page *damon_phys_get_page(unsigned long pfn)
> >>> +{
> >>> + struct page *page;
> >>> + pg_data_t *pgdat;
> >>> +
> >>> + if (!pfn_valid(pfn))
> >>> + return NULL;
> >>> +
> >>
> >> Who provides these pfns? Can these be random pfns, supplied unchecked by
> >> user space? Or are they at least mapped into some user space process?
> >
> > Your guess is right, users can give random physical address and that will be
> > translated into pfn.
> >
>
> Note the difference to idle tracking: "Idle page tracking only considers
> user memory pages", this is very different to your use case. Note that
> this is why there is no pfn_to_online_page() check in page idle code.
My use case is same to that of idle page. I also ignore non-user pages.
Actually, this function is for filtering of the non-user pages, which is simply
stollen from the page_idle.
>
> >>
> >> IOW, do we need a pfn_to_online_page() to make sure the memmap even was
> >> initialized?
> >
> > Thank you for pointing out this! I will use it in the next spin. Also, this
> > code is stollen from page_idle_get_page(). Seems like it should also be
> > modified to use it. I will send the patch for it, either.
>
> pfn_to_online_page() will only succeed for system RAM pages, not
> dax/pmem (ZONE_DEVICE). dax/pmem needs special care.
>
> I can spot that you are taking references to random struct pages. This
> looks dangerous to me and might mess in complicated ways with page
> migration/isolation/onlining/offlining etc. I am not sure if we want that.
AFAIU, page_idle users can also pass random pfns by randomly accessing the
bitmap file. Am I missing something?
Thanks,
SeongJae Park
>
> --
> Thanks,
>
> David / dhildenb
Powered by blists - more mailing lists