Message-ID: <20200506111704.5a97a3e2@gandalf.local.home>
Date:   Wed, 6 May 2020 11:17:04 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Joerg Roedel <jroedel@...e.de>
Cc:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shile Zhang <shile.zhang@...ux.alibaba.com>,
        Andy Lutomirski <luto@...capital.net>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Tzvetomir Stoyanov <tz.stoyanov@...il.com>
Subject: Re: [PATCH] tracing: Call vmalloc_sync_mappings() after
 alloc_percpu()

On Tue, 5 May 2020 14:31:44 +0200
Joerg Roedel <jroedel@...e.de> wrote:

> On Mon, May 04, 2020 at 03:10:06PM -0400, Steven Rostedt wrote:
> > I'm fine with adding it to the tracing code (with that ridiculous
> > comment! ;-)
> > 
> > I'll even tag it as stable, but again, it's uncertain what commit it
> > "fixes".  
> 
> Okay, so here it is. It fixes the issue for me and doesn't cause any
> lockdep splats on my machine. I am not sure how far this needs to be
> backported via stable, so I didn't specify it. Same with a Fixes tag.
> Oh, and I added your comment :-) (which makes up most of the added
> lines, so feel free to change authorship to you).

OK, I did take authorship. At least it will be me who gets the blame ;-)

Here's the patch:

From 685fc0b545f39bf9bfafa0396113c852ef3c81fc Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt@...dmis.org>
Date: Wed, 6 May 2020 10:36:18 -0400
Subject: [PATCH] tracing: Add a vmalloc_sync_mappings() for safe measure

x86_64 lazily maps in the vmalloc pages, and the way this interacts with
per_cpu areas can be complex, to say the least. Mappings may happen at boot
up, and if nothing synchronizes the page tables, those mappings may not be
synced into a task's page tables until they are first used. This causes
issues for anything that might touch one of those mappings in the path of
the page fault handler: when one of those unmapped mappings is touched
there, it triggers another page fault, which in turn triggers yet another,
leaving us in an endless loop of page faults.
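
To make the failure mode concrete, here is a minimal, hypothetical sketch
(struct trace_scratch, scratch_init() and scratch_hit() are made-up names
for illustration and are not part of this patch): a dynamically allocated
per-cpu buffer that a trace event in the page fault path might dereference
before its mapping has been synced into the current task's page tables.

#include <linux/errno.h>
#include <linux/module.h>
#include <linux/percpu.h>

/* Hypothetical per-cpu scratch data touched from a trace event. */
struct trace_scratch {
	unsigned long	hits;
};

static struct trace_scratch __percpu *scratch;

/*
 * If this ran from a trace event in the page fault handler while the
 * per-cpu area is not yet mapped in the current pgd, the access itself
 * would fault, and the page fault handler would recurse.
 */
static void __maybe_unused scratch_hit(void)
{
	this_cpu_ptr(scratch)->hits++;
}

static int __init scratch_init(void)
{
	/*
	 * alloc_percpu() may hand back memory backed by a fresh vmalloc
	 * mapping that is not yet present in every task's page tables.
	 */
	scratch = alloc_percpu(struct trace_scratch);
	return scratch ? 0 : -ENOMEM;
}
module_init(scratch_init);

static void __exit scratch_exit(void)
{
	free_percpu(scratch);
}
module_exit(scratch_exit);

MODULE_LICENSE("GPL");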

Commit 763802b53a42 ("x86/mm: split vmalloc_sync_all()") split
vmalloc_sync_all() into vmalloc_sync_unmappings() and
vmalloc_sync_mappings(), because on system exit a full sync is not needed
on x86_64 (although it still is on x86_32). By chance, vmalloc_sync_all()
used to synchronize the page mappings done at boot up and kept the per cpu
area from being a problem for tracing in the page fault handler. But when
that synchronization on the exit of a task became a nop, the problem
appeared.

Link: https://lore.kernel.org/r/20200429054857.66e8e333@oasis.local.home

Cc: stable@...r.kernel.org
Fixes: 737223fbca3b1 ("tracing: Consolidate buffer allocation code")
Reported-by: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@...il.com>
Suggested-by: Joerg Roedel <jroedel@...e.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@...dmis.org>
---
 kernel/trace/trace.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8d2b98812625..9ed6d92768af 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -8525,6 +8525,19 @@ static int allocate_trace_buffers(struct trace_array *tr, int size)
 	 */
 	allocate_snapshot = false;
 #endif
+
+	/*
+	 * Because of some magic with the way alloc_percpu() works on
+	 * x86_64, we need to synchronize the pgd of all the tables,
+	 * otherwise the trace events that happen in x86_64 page fault
+	 * handlers can't cope with alloc_percpu()'d memory being
+	 * touched in a page fault trace event before its mapping has
+	 * been synced. Oh, and we need to audit all other alloc_percpu()
+	 * and vmalloc() calls in tracing, because something might get
+	 * triggered within a page fault trace event!
+	 */
+	vmalloc_sync_mappings();
+
 	return 0;
 }
 
-- 
2.20.1
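
For anyone applying the same safety measure outside of the tracing code,
the pattern is simply to follow the per-cpu allocation with the sync. Here
is a hypothetical version of scratch_init() from the sketch earlier in this
mail, reusing its struct trace_scratch and scratch pointer (only
alloc_percpu() and vmalloc_sync_mappings() are real kernel interfaces; the
rest is made up for illustration):

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/percpu.h>
#include <linux/vmalloc.h>

static int __init scratch_init_synced(void)
{
	scratch = alloc_percpu(struct trace_scratch);
	if (!scratch)
		return -ENOMEM;

	/*
	 * Same call the patch adds to allocate_trace_buffers(): sync the
	 * new mapping into all page tables before anything running in the
	 * page fault path can touch it.
	 */
	vmalloc_sync_mappings();
	return 0;
}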
