Message-ID: <alpine.DEB.2.21.1807202152400.1694@nanos.tec.linutronix.de>
Date: Fri, 20 Jul 2018 21:53:41 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Andy Lutomirski <luto@...nel.org>
cc: Joerg Roedel <joro@...tes.org>, Ingo Molnar <mingo@...nel.org>,
"H . Peter Anvin" <hpa@...or.com>, X86 ML <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Hansen <dave.hansen@...el.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Juergen Gross <jgross@...e.com>,
Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>, Jiri Kosina <jkosina@...e.cz>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Brian Gerst <brgerst@...il.com>,
David Laight <David.Laight@...lab.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
Eduardo Valentin <eduval@...zon.com>,
Greg KH <gregkh@...uxfoundation.org>,
Will Deacon <will.deacon@....com>,
"Liguori, Anthony" <aliguori@...zon.com>,
Daniel Gruss <daniel.gruss@...k.tugraz.at>,
Hugh Dickins <hughd@...gle.com>,
Kees Cook <keescook@...gle.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Waiman Long <llong@...hat.com>, Pavel Machek <pavel@....cz>,
"David H . Gutteridge" <dhgutteridge@...patico.ca>,
Joerg Roedel <jroedel@...e.de>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>
Subject: Re: [PATCH 1/3] perf/core: Make sure the ring-buffer is mapped in
all page-tables
On Fri, 20 Jul 2018, Thomas Gleixner wrote:
> On Fri, 20 Jul 2018, Andy Lutomirski wrote:
> > On Fri, Jul 20, 2018 at 12:27 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
> > > On Fri, 20 Jul 2018, Andy Lutomirski wrote:
> > >> > On Jul 20, 2018, at 6:22 AM, Joerg Roedel <joro@...tes.org> wrote:
> > >> >
> > >> > From: Joerg Roedel <jroedel@...e.de>
> > >> >
> > >> > The ring-buffer is accessed in the NMI handler, so we'd better
> > >> > avoid faulting on it. Sync the vmalloc range with all
> > >> > page-tables in the system to make sure everyone has it mapped.
> > >> >
> > >> > This fixes a WARN_ON_ONCE() that can be triggered with PTI
> > >> > enabled on x86-32:
> > >> >
> > >> > WARNING: CPU: 4 PID: 0 at arch/x86/mm/fault.c:320 vmalloc_fault+0x220/0x230
> > >> >
> > >> > This triggers because with PTI enabled on a PAE kernel the
> > >> > PMDs are no longer shared between the page-tables, so the
> > >> > vmalloc changes do not propagate automatically.
> > >>
> > >> It seems like it would be much more robust to fix the vmalloc_fault()
> > >> code instead.
> > >
> > > Right, but for now the obvious fix for the issue at hand is this. We
> > > surely should revisit it.
> >
> > If you commit this under this reasoning, then please at least make it say:
> >
> > /* XXX: The vmalloc_fault() code is buggy on PTI+PAE systems, and this
> > is a workaround. */
> >
> > Let's not have code in the kernel that pretends to make sense but is
> > actually voodoo magic that works around bugs elsewhere. It's no fun
> > to maintain down the road.
>
> Fair enough. Lemme amend it. Joerg is looking into it, but I surely want to
> get that stuff some exposure in -next ASAP.
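To make the propagation problem discussed above concrete: on x86-32, a fresh
vmalloc mapping is installed only in init_mm's page-tables, and
vmalloc_fault() later copies the covering kernel PMD entry into whichever
task page-table faults on the range first. Below is a minimal sketch of that
copy step, loosely modelled on vmalloc_sync_one() in arch/x86/mm/fault.c; it
is illustrative only, with error paths and locking elided, and
sync_vmalloc_pmd() is a made-up name:

#include <linux/mm.h>		/* init_mm */
#include <asm/pgtable.h>	/* pgd/p4d/pud/pmd accessors */

/*
 * Sketch only, not the literal kernel code: copy the kernel PMD
 * entry covering @address from init_mm's reference page-table into
 * @pgd, the page-table that just took a vmalloc fault. Without PTI,
 * PAE page-tables share the kernel PMD pages, so init_mm updates are
 * visible everywhere; with PTI on x86-32 the PMDs are private and
 * this explicit copy is what publishes the mapping.
 */
static pmd_t *sync_vmalloc_pmd(pgd_t *pgd, unsigned long address)
{
	unsigned int index = pgd_index(address);
	pgd_t *pgd_k = init_mm.pgd + index;	/* kernel reference entry */
	p4d_t *p4d, *p4d_k;
	pud_t *pud, *pud_k;
	pmd_t *pmd, *pmd_k;

	pgd += index;
	if (!pgd_present(*pgd_k))
		return NULL;

	p4d   = p4d_offset(pgd, address);	/* folded on x86-32 */
	p4d_k = p4d_offset(pgd_k, address);
	if (!p4d_present(*p4d_k))
		return NULL;

	pud   = pud_offset(p4d, address);	/* folded on x86-32 */
	pud_k = pud_offset(p4d_k, address);
	if (!pud_present(*pud_k))
		return NULL;

	pmd   = pmd_offset(pud, address);
	pmd_k = pmd_offset(pud_k, address);
	if (!pmd_present(*pmd_k))
		return NULL;		/* nothing to propagate */

	if (!pmd_present(*pmd))
		set_pmd(pmd, *pmd_k);	/* propagate into this page-table */

	return pmd_k;
}

With PTI+PAE the assumptions behind this lazy scheme break down, which is
what the WARN_ON_ONCE() in vmalloc_fault() catches and what the patch below
works around by syncing eagerly at allocation and free time.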
Delta patch below.
Thanks.
tglx
8<-------------
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -815,8 +815,12 @@ static void rb_free_work(struct work_str
 	vfree(base);
 	kfree(rb);
 
-	/* Make sure buffer is unmapped in all page-tables */
-	vmalloc_sync_all();
+	/*
+	 * FIXME: PAE workaround for vmalloc_fault(): Make sure buffer is
+	 * unmapped in all page-tables.
+	 */
+	if (IS_ENABLED(CONFIG_X86_PAE))
+		vmalloc_sync_all();
 }
 
 void rb_free(struct ring_buffer *rb)
@@ -844,11 +848,13 @@ struct ring_buffer *rb_alloc(int nr_page
 		goto fail_all_buf;
 
 	/*
-	 * The buffer is accessed in NMI handlers, make sure it is
-	 * mapped in all page-tables in the system so that we don't
-	 * fault on the range in an NMI handler.
+	 * FIXME: PAE workaround for vmalloc_fault(): The buffer is
+	 * accessed in NMI handlers, make sure it is mapped in all
+	 * page-tables in the system so that we don't fault on the range in
+	 * an NMI handler.
 	 */
-	vmalloc_sync_all();
+	if (IS_ENABLED(CONFIG_X86_PAE))
+		vmalloc_sync_all();
 
 	rb->user_page = all_buf;
 	rb->data_pages[0] = all_buf + PAGE_SIZE;
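A note on the guard used above: unlike an #ifdef block,
IS_ENABLED(CONFIG_X86_PAE) (from include/linux/kconfig.h) expands to a
compile-time constant 1 or 0, so the vmalloc_sync_all() call is compiled out
on !PAE builds while the guarded code still gets parsed and type-checked in
every configuration. A small illustration of the pattern (sync_if_pae() is a
made-up name, not part of the patch):

#include <linux/kconfig.h>	/* IS_ENABLED() */
#include <linux/vmalloc.h>	/* vmalloc_sync_all() */

/*
 * IS_ENABLED(CONFIG_FOO) is 1 when the option is set (=y or =m) and
 * 0 otherwise, so the branch below is constant-folded away on
 * non-PAE builds without hiding the code from the compiler the way
 * #ifdef would.
 */
static void sync_if_pae(void)
{
	if (IS_ENABLED(CONFIG_X86_PAE))
		vmalloc_sync_all();	/* compiled out on !PAE builds */
}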