Message-ID: <20120710082104.GA11187@gmail.com>
Date: Tue, 10 Jul 2012 10:21:04 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, hpa@...or.com,
eranian@...gle.com, linux-kernel@...r.kernel.org,
fweisbec@...il.com, akpm@...ux-foundation.org, tglx@...utronix.de,
linux-tip-commits@...r.kernel.org,
Robert Richter <robert.richter@....com>
Subject: Re: [tip:perf/core] perf/x86: Fix USER/KERNEL tagging of samples
* Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> On Mon, 2012-07-09 at 20:41 +0200, Ingo Molnar wrote:
> > > +static unsigned long get_segment_base(unsigned int segment)
> > > +{
> > > + struct desc_struct *desc;
> > > + int idx = segment >> 3;
> > > +
> > > + if ((segment & SEGMENT_TI_MASK) == SEGMENT_LDT) {
> > > + if (idx > LDT_ENTRIES)
> > > + return 0;
> > > +
> > > + desc = current->active_mm->context.ldt;
> > > + } else {
> > > + if (idx > GDT_ENTRIES)
> > > + return 0;
> > > +
> > > + desc = __this_cpu_ptr(&gdt_page.gdt[0]);
> > > + }
> > > +
> > > + return get_desc_base(desc + idx);
> >
> > Shouldn't idx be checked against active_mm->context.ldt.size,
> > not LDT_ENTRIES (which is really just an upper limit)?
>
> Ah indeed, fixed that.
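
(A minimal userspace sketch of the bounds check discussed above, for illustration only. `struct seg_table`, `lookup_base()` and `SEL_INDEX_SHIFT` are hypothetical names and not the kernel's API. The check is against the number of entries actually present rather than a compile-time upper limit, and it uses `>=` so that an index equal to the entry count is rejected as well.)

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a descriptor table; not the kernel's types. */
struct seg_table {
	const uint64_t *base_of;	/* simplified: one base address per entry */
	size_t nr_entries;		/* entries actually allocated */
};

/* Bits 0-1 of an x86 selector are the RPL, bit 2 is TI, bits 3-15 the index. */
#define SEL_INDEX_SHIFT 3

/*
 * Return the segment base for a selector, or 0 if the index falls
 * outside the table that is really there.
 */
static uint64_t lookup_base(const struct seg_table *t, unsigned int selector)
{
	size_t idx = selector >> SEL_INDEX_SHIFT;

	if (t == NULL || t->base_of == NULL || idx >= t->nr_entries)
		return 0;

	return t->base_of[idx];
}
```

With a two-entry table, selectors whose index is 0 or 1 resolve to the stored base, while index 2 and above yield 0.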
Another boundary condition would be when we intentionally
twiddle the GDT: such as during suspend or during BIOS upcalls.
Can we then get a PMU interrupt? If yes then this will probably
result in garbage:
> > > + desc = __this_cpu_ptr(&gdt_page.gdt[0]);
it won't outright crash, we don't ever deallocate our GDT - but
it will return a garbage RIP.
Then there's also all the Xen craziness with segments ...
Both ought to be rare and uninteresting - but then again,
segmented execution is already rare and uninteresting to begin
with.
So, instead of trying to discover all these weird x86 cases -
with little to no testing done after that - I thought that it
might be more future proof to just handle the cases we are
explicitly interested in: flat code, and pounce in some well
defined way in all the other situations by returning the RIP to
an empty __X86_LEGACY_SEGMENTED_CODE() symbol.
That way we will at least give *some* useful information to the
poor segmented code user, if the profile says:
    21.32%  [kernel]      [k] __X86_LEGACY_SEGMENTED_CODE
    11.01%  [kernel]      [k] kallsyms_expand_symbol
     8.29%  [kernel]      [k] vsnprintf
     7.37%  libc-2.15.so  [.] __strcmp_sse42
     6.93%  perf          [.] symbol_filter
     4.20%  perf          [.] kallsyms__parse
     3.92%  [kernel]      [k] format_decode
     3.62%  [kernel]      [k] string.isra.4
     3.59%  [kernel]      [k] memcpy
     3.11%  [kernel]      [k] strnlen
then the user at least knows that there's 21% of overhead in
some sort of segmented x86 code. Or if they *really* want to
resolve that, they can take your patch and add symbol decoding
to user-space and test it all.
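
(A rough, untested userspace sketch of that fallback, for illustration only and not actual kernel code. `struct sample_regs`, `is_flat()` and `sample_ip()` are hypothetical names, and treating "code segment base is zero" as the definition of flat is an assumption of the sketch. Flat samples keep their real instruction pointer; everything else is attributed to the empty marker symbol.)

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the interrupted register state. */
struct sample_regs {
	uint64_t ip;		/* instruction pointer saved at the interrupt */
	uint64_t cs_base;	/* code segment base; assumed 0 for flat code */
};

/*
 * Empty marker function: only its address matters, so that samples
 * attributed to it show up under this name in the profile.
 */
void __X86_LEGACY_SEGMENTED_CODE(void)
{
}

/* Assumption of this sketch: "flat" simply means a zero segment base. */
static bool is_flat(const struct sample_regs *regs)
{
	return regs->cs_base == 0;
}

/* Pick the address to record for a sample. */
static uint64_t sample_ip(const struct sample_regs *regs)
{
	if (is_flat(regs))
		return regs->ip;

	/* Do not try to decode the segment; point at the marker instead. */
	return (uint64_t)(uintptr_t)__X86_LEGACY_SEGMENTED_CODE;
}
```

The point of the design is that no descriptor-table walk happens in the sampling path at all: the rare segmented case costs one comparison and is still visible in the profile as a single symbol.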
KISS and such.
Linus?
Thanks,
Ingo