Message-ID: <20090206005022.GA6803@elte.hu>
Date:	Fri, 6 Feb 2009 01:50:22 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
Cc:	Hugh Dickins <hugh@...itas.com>,
	William Lee Irwin III <wli@...ementarian.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linux Memory Management List <linux-mm@...ck.org>
Subject: Re: pud_bad vs pud_bad


* Jeremy Fitzhardinge <jeremy@...p.org> wrote:

> Ingo Molnar wrote:
>> just the act of using PAE was measured to cause multi-percent slowdown 
>> in fork() and exec() latencies, etc. The pagetables are twice as large 
>> so is that really surprising?
>>   
>
> Is there a similar slowdown running the CPU in 32 vs 64 bit mode?  Or does 
> having more/wider registers mitigate it?

Yes, of course there's a slowdown on 64-bit kernels in fork() performance, 
mainly related to pte size.

Here are some numbers measured with perfstat. The "fork" binary forks 256 times,
waits for the child tasks and then exits. It is a 32-bit binary, statically 
linked - i.e. very similar layout and function on both 32-bit and 64-bit 
kernels.

The results (tabulated a bit, average result of 20 runs):

 $ perfstat -e -3,-4,-5 ./fork

  Performance counter stats for './fork':

        32-bit  32-bit-PAE     64-bit
     ---------  ----------  ---------
     27.367537   30.660090  31.542003  task clock ticks     (msecs)

          5785        5810       5751  pagefaults           (events)
           389         388        388  context switches     (events)
             4           4          4  CPU migrations       (events)
     ---------  ----------  ---------
                    +12.0%     +15.2%  overhead

So PAE is 12.0% slower (the overhead of double the pte size and three page 
table levels), and 64-bit is 15.2% slower (the extra overhead of having four 
page table levels added to the overhead of double the pte size).

Larger ptes do not come for free, and the 64-bit instructions do not mitigate 
the cache-miss overhead and memory bandwidth cost.

	Ingo
