linux-kernel - Re: [patch 10/10] Scheduler profiling

Message-Id: <20070705132120.8edbc1f3.akpm@linux-foundation.org>
Date:	Thu, 5 Jul 2007 13:21:20 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	Alexey Dobriyan <adobriyan@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: [patch 10/10] Scheduler profiling - Use immediate values

On Tue, 3 Jul 2007 14:57:48 -0400
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:

> Measuring the system-wide impact of this single modification shows that
> the difference made by one site falls within the standard deviation of
> the normal samples. It will become significant when the number of
> immediate values used in place of global variables on hot kernel paths
> (weighted by how often the data is accessed) becomes significant
> relative to the L1 data cache size. We could characterize this in
> memory-to-L1-cache transfers per second.
> 
> On 3GHz P4:
> 
> memory read: ~48 cycles
> 
> So we can definitely say that 48*HZ (approximation of the frequency at
> which the scheduler is called) won't make much difference, but as it
> grows, it will.
> 
> On a 1000HZ system, it results in:
> 
> 48000 cycles/second, or 16µs/second, or a 0.0016% speedup.
> 
> However, if we place this in code called much more often, such as
> do_page_fault, we get, in a hypothetical scenario of approximately
> 100000 page faults per second:
> 
> 4800000 cycles/s, 1.6ms/second, or a 0.16% speedup.
> 
> So as the number of immediate values used increases, the overall memory
> bandwidth required by the kernel will go down.
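
The arithmetic in the quoted estimate can be reproduced with a short
sketch. It assumes exactly one avoided memory read per event and takes
the quoted figures (3GHz clock, ~48 cycles per read, HZ=1000, 100000
page faults per second) at face value; the function and constant names
are made up for illustration. Note that 16µs out of every second is a
fraction of 0.000016, which is 0.0016 when expressed as a percentage.

```python
# Back-of-envelope estimate; every input is a figure quoted above,
# not a new measurement.
CPU_HZ = 3_000_000_000   # 3GHz P4
CYCLES_PER_READ = 48     # quoted cost of one memory read

def saved_per_second(events_per_second):
    """Return (cycles, seconds, percent) saved per second of run time,
    assuming one avoided memory read per event."""
    cycles = CYCLES_PER_READ * events_per_second
    seconds = cycles / CPU_HZ
    percent = seconds * 100.0   # share of each one-second interval
    return cycles, seconds, percent

sched = saved_per_second(1000)       # scheduler tick at HZ=1000
faults = saved_per_second(100_000)   # hypothetical page-fault rate

for name, (cycles, seconds, percent) in (("sched", sched), ("faults", faults)):
    print(f"{name}: {cycles} cycles/s, {seconds * 1e6:.1f} us/s, {percent:.4f}%")
```

The saving scales linearly with the event rate, so the result is only
as good as the 48-cycle figure, which is the number questioned below.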

Is that 48 cycles measured when the target of the read is in L1 cache, as
it would be in any situation which we actually care about?  I guess so...

Boy, this is a tiny optimisation and boy, you added a pile of tricky new
code to obtain it.

Frankly, I'm thinking that life would be simpler if we just added static
markers and stopped trying to add lots of tricksy
maintenance-load-increasing things like this.

Ho hum.  Need more convincing, please.

Also: a while back (maybe as much as a year) we had an extensive discussion
regarding whether we want static markers at all in the kernel.  The
eventual outcome was, I believe, "yes".

But our reasons for making that decision appear to have been lost.  So if I
were to send the markers patches to Linus and he were to ask me "why are
you sending these", I'd be forced to answer "I don't know".  This is not a
good situation.

Please prepare and maintain a short document which describes the
justification for making all these changes to the kernel.  The changelog
for the main markers patch would be an appropriate place for this.  The
target audience would be kernel developers and it should capture the pro-
and con- arguments which were raised during that discussion.

Basically: tell us why we should merge _any_ of this stuff, because I for
one have forgotten.  Thanks.