lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Mon, 9 Apr 2007 11:20:12 -0700
From:	William Lee Irwin III <wli@...omorphy.com>
To:	Christoph Lameter <clameter@....com>
Cc:	Andy Whitcroft <apw@...dowen.org>,
	Martin Bligh <mbligh@...gle.com>,
	Dave Hansen <hansendc@...ibm.com>, Andi Kleen <ak@...e.de>,
	linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
	linux-mm@...ck.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [PATCH 1/4] x86_64: (SPARSE_VIRTUAL doubles sparsemem speed)

On Mon, 9 Apr 2007, William Lee Irwin III wrote:
>> Whatever's going on with the rest of this, I really like this
>> instrumentation patch. It may be worthwhile to allow pc_start() to be
>> overridden so things like performance counter MSR's are usable, but
>> the framework looks very useful.

On Mon, Apr 09, 2007 at 10:16:06AM -0700, Christoph Lameter wrote:
> Yeah. I also did some measurements on quicklists on x86_64 and it seems 
> that caching page table pages is also useful:
[...]

Sadly these timings will not address the arguments made on behalf of
eager zeroing, which are essentially that "precharging the cache" will
benefit fork(). I personally still find them meaningful.

I think a demonstration that will deal with more of others' concerns
might be instrumenting fault latency after a fresh execve() and
comparing fork() timings. This is clearly awkward given the scheduling
opportunities in these areas, but a virtual time measurement may suffice
to cope with those.

The fault latency after a fresh execve() is particularly relevant
because it's a scenario where incremental pagetable construction occurs
in "real life." It's actually less common for processes to merely fork()
than to fork() then execve() and then perform most of their work in the
context set up in execve(). The pagetables set up for fork() are most
commonly short-lived and the utility of caching them or even
constructing them so questionable that pagetable sharing and other
methods of avoiding the copy are quite plausible, though perhaps in
need of some heuristics to avoid new faults for the purposes of merely
copying pagetables where possible (AIUI such are considered or
implemented in pagetable sharing patches; of course, Dave McCracken has
far more detailed knowledge of the performance considerations there).

I daresay it's highly dubious to consider pagetable construction for
fork() in isolation when the common case is to almost immediately flush
all those pagetables down the toilet via execve().

Basically, if we can establish that (1) pagetable caching doesn't hurt
fork() or maybe even speeds it up then (2) it speeds up post-execve()
faults, then things look good. Raw timings on the pagetable allocation
primitives are unfortunately too micro to adequately make our case.


-- wli
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists