lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 17 Feb 2009 04:03:53 +0100
From:	Nick Piggin <npiggin@...e.de>
To:	Masami Hiramatsu <mhiramat@...hat.com>
Cc:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	Peter Zijlstra <peterz@...radead.org>,
	akpm <akpm@...ux-foundation.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Jim Keniston <jkenisto@...ibm.com>
Subject: Re: irq-disabled vs vmap vs text_poke

On Mon, Feb 16, 2009 at 09:00:35PM -0500, Masami Hiramatsu wrote:
> Mathieu Desnoyers wrote:
> >* Nick Piggin (npiggin@...e.de) wrote:
> >>On Mon, Feb 16, 2009 at 10:04:43AM -0500, Masami Hiramatsu wrote:
> >>>>>>>>BTW, what about using map_vm_area() in text_poke() instead of 
> >>>>>>>>vmap()?
> >>>>>>>>Since text_poke() just maps text pages to alias pages temporarily,
> >>>>>>>>I think we don't need to use delayed vunmap().
> >>[...] 
> >>
> >>>Here is the patch which replace v(un)map with (un)map_vm_area.
> >>I don't quite understand the point of this... delayed vunmap() is
> >>just an implementation detail of vmap subsystem. Callers should not
> >>have to care.
> >>
> >
> >AFAIK, map_vm_area/unmap_vm_area is faster than vmap/vunmap.  This is
> >the point of this patch. Masami, could you provide a quick benchmark of
> >text_poke()/seconds before and after this optimization is applied to
> >confirm this ?
> 
> Sure, here is the result of calling text_poke() 2^14 times.
> 
> <Without this patch>
> Total: 3634133356(cycles), 221809(cycles/text_poke)
> Total: 3699532690(cycles), 225801(cycles/text_poke)
> Total: 3249855588(cycles), 198355(cycles/text_poke)
> 
> <With this patch>
> Total: 483467579(cycles), 29508(cycles/text_poke)
> Total: 497441301(cycles), 30361(cycles/text_poke)
> Total: 497604548(cycles), 30371(cycles/text_poke)

Hmm, on bigger SMP systems, I think the global TLB flush required
for unmap_kernel_range will reverse these numbers.

 
> BTW, this is not only for performance, but also simplicity and its need.
> Vmap may allocate new vm_area. However, since text_poke() just needs to
> map pages temporarily (yeah, very short time), we don't want to call
> kmalloc or any other memory allocators.
> And since text_poke() makes WRITABLE aliases of READ-ONLY pages, we
> want to purge these pages ASAP.
> So, I think just reserving a small vm_area for text_poke() and
> reusing it is enough.

It is not a bad idea, but I don't think it quite goes far enough.
IMO we should reserve 2 pages of virtual memory for each CPU, and
then do the mapping/unmapping without locking, and with another
variant of unmap_kernel_range that does not do the global TLB
flush.

Unless performance doesn't really matter much, in which case, I
guess your patch is nice because it avoids doing the allocations.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ