lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 1 May 2020 12:16:08 +0200
From:   Joerg Roedel <jroedel@...e.de>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shile Zhang <shile.zhang@...ux.alibaba.com>,
        Andy Lutomirski <luto@...capital.net>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Tzvetomir Stoyanov <tz.stoyanov@...il.com>
Subject: Re: [RFC][PATCH] x86/mm: Sync all vmalloc mappings before text_poke()

On Thu, Apr 30, 2020 at 10:39:19PM -0400, Steven Rostedt wrote:
> I'll give the answer I gave to Joerg when he replied to my accidental
> private (not public) email:
> 
> Or even my original patch would be better than having the generic tracing
> code understanding the intrinsic properties of vmalloc() and
> alloc_percpu() on x86_64. I really don't think it is wise to have:
> 
> 	foo = alloc_percpu();
> 
> 	/*
> 	 * Because of some magic with the way alloc_percpu() works on
> 	 * x86_64, we need to synchronize the pgd of all the tables,
> 	 * otherwise the trace events that happen in x86_64 page fault
> 	 * handlers can't cope with accessing the chance that a
> 	 * alloc_percpu()'d memory might be touched in the page fault trace
> 	 * event. Oh, and we need to audit all alloc_percpu() and vmalloc()
> 	 * calls in tracing, because something might get triggered within a
> 	 * page fault trace event!
> 	 */
> 	vmalloc_sync_mappings();
> 
> That would be exactly what I add as a comment if it were to be added in the
> generic tracing code.
> 
> And we would need to audit any percpu alloc'd code in all tracing, or
> anything that might git hooked into something that hooks to the page fault
> trace point.
> 
> Since this worked for a decade without this, I'm strongly against adding it
> in the generic code due to some issues with a single architecture.

That is exactly the problem with vmalloc_sync_mappings()/unmappings().
It is not at all clear when it needs to be called and why, or even who
needs is responsible for calling it. The existing call-sites in Notifier
and ACPI code have no comment on why it is necessary to synchronize the
vmalloc mappings there.

It is only needed for x86, we could also get rid of it completely if:

	1) At x86-64 we pre-allocate all 64 P4D/PUD pages for the
	   vmalloc area in init_mm at boot time. This needs 256kb of
	   memory per system, most of it potentially unused as each
	   P4D/PUD maps 512GB of address space.

	2) At x86-32 we need to disable large pages for vmalloc/ioremap
	   mappings and pre-allocate the PTE pages for the vmalloc area
	   in init_mm. Depending on how much memory the system has and
	   the configured kernel/user split this might take more than 64
	   pages.

With that we could get rid of the vmalloc_sync interface and also the
vmalloc-fault code in general and reduce the complexity. This interface
has caused problems more than once. On the other side it would trade
memory usage against complexity.

Regards,

	Joerg

Powered by blists - more mailing lists