Message-ID: <20190131125908.GB31552@hirez.programming.kicks-ass.net>
Date: Thu, 31 Jan 2019 13:59:08 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: kan.liang@...ux.intel.com
Cc: acme@...nel.org, tglx@...utronix.de, mingo@...hat.com,
linux-kernel@...r.kernel.org, eranian@...gle.com, jolsa@...hat.com,
namhyung@...nel.org, ak@...ux.intel.com,
Andy Lutomirski <luto@...capital.net>
Subject: Re: [PATCH V3 01/13] perf/core, x86: Add PERF_SAMPLE_DATA_PAGE_SIZE
On Thu, Jan 31, 2019 at 01:37:25PM +0100, Peter Zijlstra wrote:
> On Wed, Jan 30, 2019 at 06:23:42AM -0800, kan.liang@...ux.intel.com wrote:
> > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> > index 374a197..03bf45d 100644
> > --- a/arch/x86/events/core.c
> > +++ b/arch/x86/events/core.c
> > @@ -2578,3 +2578,45 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
> > cap->events_mask_len = x86_pmu.events_mask_len;
> > }
> > EXPORT_SYMBOL_GPL(perf_get_x86_pmu_capability);
> > +
> > +/*
> > + * map x86 page levels to perf page sizes
> > + */
> > +static const enum perf_page_size perf_page_size_map[PG_LEVEL_NUM] = {
> > + [PG_LEVEL_NONE] = PERF_PAGE_SIZE_NONE,
> > + [PG_LEVEL_4K] = PERF_PAGE_SIZE_4K,
> > + [PG_LEVEL_2M] = PERF_PAGE_SIZE_2M,
> > + [PG_LEVEL_1G] = PERF_PAGE_SIZE_1G,
> > + [PG_LEVEL_512G] = PERF_PAGE_SIZE_512G,
> > +};
> > +
> > +u64 perf_get_page_size(u64 virt)
> > +{
> > + unsigned long flags;
> > + unsigned int level;
> > + pte_t *pte;
> > +
> > + if (!virt)
> > + return 0;
> > +
> > + /*
> > + * Interrupts are disabled, so it prevents any tear down
> > + * of the page tables.
> > + * See the comment near struct mmu_table_batch.
> > + */
> > + local_irq_save(flags);
> > + if (virt >= TASK_SIZE)
> > + pte = lookup_address(virt, &level);
> > + else {
> > + if (current->mm)
> > + pte = lookup_address_in_pgd(pgd_offset(current->mm, virt),
> > + virt, &level);
>
> Aside from all the missing {}, I'm fairly sure this is broken since this
> happens from NMI context. This can interrupt switch_mm() and things like
> use_temporary_mm().
Ah, I'm confused again. This is a software page-table walk and is not
affected by the current CR3 state, which is much safer.
The rest of the comment still applies, of course.
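
To illustrate why a software walk does not depend on CR3, here is a minimal userspace sketch, not kernel code. Everything in it (the toy_* names, the two-level layout, the 4-entry tables) is made up for illustration. The only point is that the walk starts from a root passed in explicitly, the way lookup_address_in_pgd() is handed a pgd, and never consults whatever root the hardware currently has loaded.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum toy_level { TOY_LEVEL_NONE = 0, TOY_LEVEL_4K = 1, TOY_LEVEL_2M = 2 };

#define TOY_ENTRIES 4      /* real x86 tables have 512 entries per level */
#define TOY_PRESENT 0x1u
#define TOY_HUGE    0x2u

struct toy_pte { unsigned int flags; };
struct toy_pmd { unsigned int flags; struct toy_pte *ptes; };
struct toy_mm  { struct toy_pmd pmd[TOY_ENTRIES]; };

/*
 * Walk the tables rooted at @mm and report the level at which @virt is
 * mapped. The root is an explicit argument; no global "currently loaded"
 * root is read, so switching address spaces elsewhere cannot change the
 * result for a given @mm.
 */
enum toy_level toy_lookup_level(const struct toy_mm *mm, uint64_t virt)
{
	size_t pmd_idx = (size_t)((virt >> 21) % TOY_ENTRIES);
	size_t pte_idx = (size_t)((virt >> 12) % TOY_ENTRIES);
	const struct toy_pmd *pmd;

	if (!mm)
		return TOY_LEVEL_NONE;

	pmd = &mm->pmd[pmd_idx];
	if (!(pmd->flags & TOY_PRESENT))
		return TOY_LEVEL_NONE;

	/* A huge entry terminates the walk one level early. */
	if (pmd->flags & TOY_HUGE)
		return TOY_LEVEL_2M;

	if (!pmd->ptes || !(pmd->ptes[pte_idx].flags & TOY_PRESENT))
		return TOY_LEVEL_NONE;

	return TOY_LEVEL_4K;
}
```

The sketch says nothing about the lifetime of the tables themselves; in the patch that is what the local_irq_save() and the mmu_table_batch comment are about.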