Message-ID: <20080415143928.GA24556@Krystal>
Date: Tue, 15 Apr 2008 10:39:28 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: Ingo Molnar <mingo@...e.hu>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>, prasad@...ux.vnet.ibm.com,
linux-kernel@...r.kernel.org, tglx@...utronix.de,
Christoph Hellwig <hch@...radead.org>,
"Frank Ch. Eigler" <fche@...hat.com>
Subject: Re: [RFC PATCH 1/2] Marker probes in futex.c
* Ingo Molnar (mingo@...e.hu) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:
>
> > * Ingo Molnar (mingo@...e.hu) wrote:
> > >
> > > * Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> > >
> > > > See, these tracer tools are my nightmare as member of an
> > > > enterprise linux team. They'll make an already hard job even
> > > > harder, no thanks!
> > >
> > > i'm clearly NAK-ing all futex trace hooks until the true impact of
> > > the whole marker facility is better understood. I've tried them for
> > > the scheduler and they were a clear failure: too bloated and too
> > > persistent.
> >
> > I have not seen any counter argument for the in-depth analysis of the
> > instruction cache impact of the optimized markers I've done. Arguing
> > that the markers are "bloated" based only on "size kernel/sched.o"
> > output is a bit misleading.
>
> uhm, i'm not sure what you mean - how else would you quantify bloat than
> to look at the size of the affected subsystem?
>
> Ingo
Data cache bloat inspection:
The "size" output counts all the data placed in special sections. At
link time, these sections are grouped together, far from the actual
cache-hot kernel data, so that data does not add to the hot data cache
footprint.
Instruction cache bloat inspection:
If a code region is placed among cache-cold instructions (unlikely
branches), it should not increase the cache impact: although it may
occupy one more cache line, that line will rarely be loaded into the
cache, because all the code sharing it is unlikely to run.
TLB entries bloat:
If code is added in unlikely branches, the increase in instruction size
could increase the number of TLB entries required to cover the cache-hot
code. However, in our case, each optimized marker adds 10 (hot) + 50
(cold) bytes to the scheduler code, so it would take 68 markers to fill
a whole 4kB page, i.e. one TLB entry. Statistically, we can suppose that
adding fewer than 34 markers to the scheduler should not use any
additional TLB entry. Adding 3 markers is therefore very unlikely to
increase the TLB impact. Given we have about 1024 TLB entries, adding
about 1/23rd of a TLB entry (3 x 60 = 180 bytes out of 4096) to the
cache-hot kernel instructions should not matter much, especially since
it might be absorbed by alignment.
And since the kernel core code is placed in "Huge TLB pages" on many
architectures nowadays, I really don't think the impact of a few bytes
out of 4MB is significant.
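The arithmetic behind the TLB estimate above can be sketched as follows.
This is only an illustration, not part of the marker patches. The 10 +
50 byte per-marker cost, the 4kB page and the 4MB huge page are taken
from this message. The function names are hypothetical, and the
uniform-offset model used to explain the 34-marker threshold is an
assumption about the reasoning, not something stated here.

```python
# Figures quoted in the message.
HOT_BYTES_PER_MARKER = 10        # bytes added on the cache-hot path
COLD_BYTES_PER_MARKER = 50       # bytes added in the unlikely branch
SMALL_PAGE = 4 * 1024            # 4kB page, one TLB entry
HUGE_PAGE = 4 * 1024 * 1024      # 4MB huge page, one TLB entry


def marker_bytes(n_markers):
    """Total code bytes added by n_markers optimized markers."""
    return n_markers * (HOT_BYTES_PER_MARKER + COLD_BYTES_PER_MARKER)


def markers_per_page(page_size=SMALL_PAGE):
    """How many whole markers fit in one page of the given size."""
    return page_size // (HOT_BYTES_PER_MARKER + COLD_BYTES_PER_MARKER)


def page_fraction(n_markers, page_size=SMALL_PAGE):
    """Fraction of one page occupied by the added marker code."""
    return marker_bytes(n_markers) / page_size


def extra_page_probability(n_markers, page_size=SMALL_PAGE):
    """Estimated chance that the added bytes spill into a new page.

    Assumption (hypothetical model): the existing code ends at a
    uniformly random offset within its last page, so the chance of
    crossing a page boundary equals the fraction of a page added.
    """
    return min(1.0, page_fraction(n_markers, page_size))


if __name__ == "__main__":
    print("markers per 4kB page:", markers_per_page())
    for n in (3, 34, 68):
        print(n, "markers:", marker_bytes(n), "bytes,",
              f"{page_fraction(n):.3f} of a 4kB page,",
              f"{page_fraction(n, HUGE_PAGE):.6f} of a 4MB page")
```

Under these assumptions, 3 markers add 180 bytes, about 4.4% of a 4kB
page, which is close to the fraction quoted above. The estimated chance
of needing an extra 4kB page passes 50% between 34 and 35 markers.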
I therefore think that looking only at code size is misleading when
considering the cache impact of markers, since they have been designed
to put the bytes as far away as possible from cache-hot memory.
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68