Date:	Fri, 25 Apr 2008 20:13:13 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Andi Kleen <andi@...stfloor.org>, Jiri Slaby <jirislaby@...il.com>,
	David Miller <davem@...emloft.net>, zdenek.kabelac@...il.com,
	rjw@...k.pl, paulmck@...ux.vnet.ibm.com, akpm@...ux-foundation.org,
	linux-ext4@...r.kernel.org, herbert@...dor.apana.org.au,
	penberg@...helsinki.fi, clameter@....com,
	linux-kernel@...r.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	pageexec@...email.hu, "H. Peter Anvin" <hpa@...or.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>
Subject: Re: [PATCH 1/1] x86: fix text_poke


* Ingo Molnar <mingo@...e.hu> wrote:

> * Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> 
> > > i don't think we should be too worried about performance at this 
> > > moment - this code is so rarely used that it should be driven by 
> > > robustness, i think.
> > 
> > That really isn't true. This isn't done just once. It's done many 
> > thousands of times.
> > 
> > I agree that it has to be robust, but if we want to make 
> > suspend/resume be instantaneous (and we do), performance does 
> > actually matter. Yes, this is probably much less of a problem than 
> > waiting for devices, and no, I haven't timed it, but if I counted 
> > right, we'll literally be making almost ten thousand of these calls 
> > over a suspend/resume cycle.
> > 
> > That's not "rarely used".
> 
> yeah, it's done 2800 times on my box with a distro .config.
> 
> no strong feeling either way - but i don't think there's any cross-CPU 
> TLB flush done in this case within vmap()/vunmap(). Why? Because when 
> alternative_instructions() runs, we have just a single CPU in 
> cpu_online_map.
> 
> So i think it's only direct vmap()/vunmap() overhead, on a single CPU. 
> We do a kmalloc/kfree which is rather fast - sub-microsecond. We 
> install the pages in the pte's - this is rather fast as well - 
> sub-microsecond. Even assuming cache-cold lines (which they are most 
> of the time) and taken thousands of times that's at most a few 
> milliseconds IMO.
> 
> In fact, most of the actual vmap() related overhead should be 
> well-cached (the kmalloc bits) - the main cost should come from 
> trashing through all the instruction sites and modifying them.

i just did some direct measurements of alternatives_smp_switch() itself:

 alternatives took: 7374 usecs
 alternatives took: 8775 usecs
 alternatives took: 7498 usecs
 alternatives took: 8776 usecs

that's on a ~2GHz Athlon64 X2 - so not the latest hw.

i also added a sysctl to turn alternatives patching on/off, and timed 
the CPU offline+online cycle:

  # alternatives on:
  real    0m0.152s
  real    0m0.172s

  # alternatives off:
  real    0m0.146s
  real    0m0.168s

so it's measurable and it is in the few-milliseconds range. (But there 
seems to be a strong dependency on the kernel image layout or some other 
detail - compare these timings with my previous ones; they were 
radically different.)

	Ingo
