Date:	Fri, 23 Jan 2009 00:04:23 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
Cc:	Nick Piggin <npiggin@...e.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>, hpa@...or.com,
	jeremy@...source.com, chrisw@...s-sol.org, zach@...are.com,
	rusty@...tcorp.com.au
Subject: Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT


* Jeremy Fitzhardinge <jeremy@...p.org> wrote:

> Ingo Molnar wrote:
>> Ouch, that looks unacceptably expensive. All the major distros turn
>> CONFIG_PARAVIRT on. paravirt_ops was introduced in x86 with the express
>> promise of no measurable runtime overhead.
>>
>> ( And I suspect the real-life mmap cost is probably even more expensive,
>>   as on a Barcelona all of lmbench fits into the cache, hence we don't
>>   see any real $cache overhead. )
>>
>> Jeremy, any ideas where this slowdown comes from and how it could be  
>> fixed?
>>   
>
> I just posted a couple of patches to pick some low-hanging fruit. It
> turns out that we don't need any pvops calls for pte flag
> manipulations. I'd be interested to see how much of a difference it
> makes (it reduces the static code size by a few k).
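
( For context: a pte flag manipulation is a pure bit-op on a local
  pte value; nothing hypervisor-visible changes until the pte is
  actually written into the page table, so the indirect paravirt
  call can be dropped. A minimal sketch of the idea, with names
  simplified from the real x86 code, not the actual patch: )

    /* Simplified stand-ins for the real x86 definitions: */
    typedef struct { unsigned long pte; } pte_t;
    #define _PAGE_RW (1UL << 1)

    /*
     * Under CONFIG_PARAVIRT these accessors used to go through an
     * indirect paravirt call; a plain inline bit-op is enough, since
     * only the final set_pte() needs hypervisor involvement.
     */
    static inline pte_t pte_mkwrite(pte_t pte)
    {
            pte.pte |= _PAGE_RW;    /* set the writable bit */
            return pte;
    }

    static inline pte_t pte_wrprotect(pte_t pte)
    {
            pte.pte &= ~_PAGE_RW;   /* clear the writable bit */
            return pte;
    }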

I've tried your patches, but I can see no significant reduction in
overhead. I've updated my table with the numbers from your patches:

 ------------------------------------------------------------------------
 | Performance counter stats for './mmap-perf'                          |
 |-----------------------------------------------------------------------
 | defconfig  | PARAVIRT=y |    +Jeremy
 |-----------------------------------------------------------------------
 | 1311.55452 | 1360.62493 | 1378.94464  task clock (msecs)        +3.74%
 |.......................................................................
 |          1 |          1 |          0  CPU migrations
 |         91 |         79 |         77  context switches
 |      55945 |      55943 |      55980  pagefaults
 |.......................................................................
 | 3781392474 | 3918777174 | 3907189795  CPU cycles                +3.63%
 | 1957153827 | 2161280486 | 2161741689  instructions             +10.43%
 |   50234816 |   51303520 |   50619593  cache references          +2.12%
 |    5428258 |    5583728 |    5575808  cache misses              +2.86%
 |  437983499 |  478967061 |  479053595  branches                  +9.36%
 |   32486067 |   32336874 |   32377710  branch-misses             -0.46%
 |.......................................................................
 | 1314.78246 | 1363.69444 | 1357.58161  time elapsed (msecs)      +3.72%
 ------------------------------------------------------------------------

'+Jeremy' is a CONFIG_PARAVIRT=y run with your patches applied. The
percentage column compares PARAVIRT=y against defconfig (e.g. for task
clock: 1360.62 / 1311.55 ≈ 1.0374, i.e. +3.74%).

The most stable count is the instruction count:

 | 1957153827 | 2161280486 | 2161741689  instructions             +10.43%

But your two patches did not reduce the instruction count in any
measurable way. (Instruction counts are deterministic for a given
binary and workload, unlike cycle counts, which fluctuate with
frequency and cache state, so they are the most reliable comparison.)

In any case, it is rather inefficient for me to proxy-test your
patches; you can do these measurements yourself on any Core2 or later
Intel CPU, by running tip/master and picking up these two utilities:

    http://people.redhat.com/mingo/perfcounters/perfstat.c
    http://redhat.com/~mingo/misc/mmap-perf.c

Build them, then run this (as root):

    taskset 1 ./perfstat ./mmap-perf 1

It will give you numbers like the ones above.
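
( mmap-perf is essentially a tight mmap + touch + munmap loop over an
  anonymous mapping, which is exactly what exercises the pte accessor
  paths above. If you just want the shape of the workload, an assumed,
  minimal stand-in looks like this; the real mmap-perf.c differs in
  its sizes and iteration counts: )

    #define _GNU_SOURCE
    #include <string.h>
    #include <sys/mman.h>

    #define SIZE  (256 * 1024)
    #define LOOPS 50000

    int main(void)
    {
            for (int i = 0; i < LOOPS; i++) {
                    char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
                    if (p == MAP_FAILED)
                            return 1;
                    memset(p, 0, SIZE);     /* fault every page in */
                    munmap(p, SIZE);
            }
            return 0;
    }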

	Ingo
