linux-kernel - Re: [tip:perf/core] perf/x86: Fix USER/KERNEL tagging of samples


Message-ID: <20120710082104.GA11187@gmail.com>
Date:	Tue, 10 Jul 2012 10:21:04 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>, hpa@...or.com,
	eranian@...gle.com, linux-kernel@...r.kernel.org,
	fweisbec@...il.com, akpm@...ux-foundation.org, tglx@...utronix.de,
	linux-tip-commits@...r.kernel.org,
	Robert Richter <robert.richter@....com>
Subject: Re: [tip:perf/core] perf/x86: Fix USER/KERNEL tagging of samples

* Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:

> On Mon, 2012-07-09 at 20:41 +0200, Ingo Molnar wrote:
> > > +static unsigned long get_segment_base(unsigned int segment)
> > > +{
> > > +     struct desc_struct *desc;
> > > +     int idx = segment >> 3;
> > > +
> > > +     if ((segment & SEGMENT_TI_MASK) == SEGMENT_LDT) {
> > > +             if (idx > LDT_ENTRIES)
> > > +                     return 0;
> > > +
> > > +             desc = current->active_mm->context.ldt;
> > > +     } else {
> > > +             if (idx > GDT_ENTRIES)
> > > +                     return 0;
> > > +
> > > +             desc = __this_cpu_ptr(&gdt_page.gdt[0]);
> > > +     }
> > > +
> > > +     return get_desc_base(desc + idx);
> > 
> > Shouldn't idx be checked against active_mm->context.ldt.size, 
> > not LDT_ENTRIES (which is really just an upper limit)?
> 
> Ah indeed, fixed that.
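
To illustrate the bound being discussed, here is a minimal user-space 
sketch, not the kernel code: the index is checked against the number 
of entries the selected table actually holds rather than against a 
compile-time maximum. The struct desc_table type, the SEG_* constants 
and lookup_segment_base() are hypothetical names made up for this 
example, and each table is simplified to an array of base addresses. 
The sketch uses >= because an index equal to the entry count is 
already out of range.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 2 of an x86 segment selector is the table indicator (1 = LDT). */
#define SEG_TI_MASK 0x4u
#define SEG_LDT     0x4u

/* Hypothetical, simplified descriptor table: one base address per entry. */
struct desc_table {
	const uint64_t *bases;
	size_t nr_entries;	/* entries actually present, not the maximum */
};

/*
 * Return the base for 'selector', or 0 if the index lies outside the
 * table that the selector refers to.
 */
static uint64_t lookup_segment_base(unsigned int selector,
				    const struct desc_table *gdt,
				    const struct desc_table *ldt)
{
	size_t idx = selector >> 3;	/* drop the RPL and TI bits */
	const struct desc_table *table;

	table = ((selector & SEG_TI_MASK) == SEG_LDT) ? ldt : gdt;

	/* Check against the real size of the table. */
	if (table->bases == NULL || idx >= table->nr_entries)
		return 0;

	return table->bases[idx];
}
```

With a one-entry LDT, selector index 1 with the TI bit set yields 0 
here instead of reading past the end of the table.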

Another boundary condition would be when we intentionally 
twiddle the GDT: such as during suspend or during BIOS upcalls. 
Can we then get a PMU interrupt? If yes then this will probably 
result in garbage:

> > > +             desc = __this_cpu_ptr(&gdt_page.gdt[0]);

it won't outright crash, we don't ever deallocate our GDT - but 
it will return a garbage RIP.
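
A rough user-space sketch of why a stale descriptor gives a garbage 
address, assuming the sampled address is formed as segment base plus 
offset. The base is scattered over three fields of the 8-byte legacy 
descriptor, so whatever bits happen to sit in that slot get 
reassembled into a base. desc_base() and linear_ip() are hypothetical 
helper names for this example only.

```c
#include <stdint.h>

/*
 * Reassemble the 32-bit base from a raw 8-byte legacy segment
 * descriptor: base[15:0] is in bits 16-31, base[23:16] in bits 32-39
 * and base[31:24] in bits 56-63.
 */
static uint32_t desc_base(uint64_t raw)
{
	uint32_t base0 = (uint32_t)((raw >> 16) & 0xffff);
	uint32_t base1 = (uint32_t)((raw >> 32) & 0xff);
	uint32_t base2 = (uint32_t)((raw >> 56) & 0xff);

	return base0 | (base1 << 16) | (base2 << 24);
}

/* Linear address of a sample: segment base plus the sampled offset. */
static uint64_t linear_ip(uint64_t raw_desc, uint64_t offset)
{
	return (uint64_t)desc_base(raw_desc) + offset;
}
```

For a flat descriptor the base is 0 and the offset passes through 
unchanged; for any other slot contents the result is only as 
meaningful as the descriptor that was read.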

Then there's also all the Xen craziness with segments ...

Both ought to be rare and uninteresting - but then again, 
segmented execution is already rare and uninteresting to begin 
with.

So, instead of trying to discover all these weird x86 cases - 
with little to no testing done after that - I thought that it 
might be more future-proof to just handle the cases we are 
explicitly interested in: flat code, and punt in some 
well-defined way in all the other situations by pointing the RIP 
at an empty __X86_LEGACY_SEGMENTED_CODE() symbol.
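
A minimal user-space sketch of that idea, not a proposed patch: flat 
samples keep their address, everything else is redirected to the 
address of an empty marker function so it shows up under one symbol. 
legacy_segmented_code_marker() is a stand-in for the 
__X86_LEGACY_SEGMENTED_CODE() symbol, and resolve_sample_ip() and its 
cs_is_flat argument are assumptions made for this example; how 
flatness would be detected is left open here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Empty marker function (hypothetical stand-in). Its only purpose is
 * to own an address that a symbolizer can resolve to a name.
 */
void legacy_segmented_code_marker(void)
{
}

/*
 * Return the address to report for a sample. 'cs_is_flat' is assumed
 * to be supplied by the caller.
 */
static uint64_t resolve_sample_ip(uint64_t ip, bool cs_is_flat)
{
	if (cs_is_flat)
		return ip;

	/* Not flat: do not decode, report the marker symbol instead. */
	return (uint64_t)(uintptr_t)&legacy_segmented_code_marker;
}
```

The design choice is that no descriptor table is touched in the 
non-flat path, so the boundary conditions above cannot produce a 
garbage address.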

That way we will at least give *some* useful information to the 
poor segmented code user, if the profile says:

    21.32%  [kernel]      [k] __X86_LEGACY_SEGMENTED_CODE
    11.01%  [kernel]      [k] kallsyms_expand_symbol     
     8.29%  [kernel]      [k] vsnprintf                  
     7.37%  libc-2.15.so  [.] __strcmp_sse42             
     6.93%  perf          [.] symbol_filter              
     4.20%  perf          [.] kallsyms__parse            
     3.92%  [kernel]      [k] format_decode              
     3.62%  [kernel]      [k] string.isra.4              
     3.59%  [kernel]      [k] memcpy                     
     3.11%  [kernel]      [k] strnlen                    

then the user at least knows that there's 21% of overhead in 
some sort of segmented x86 code. Or if they *really* want to 
resolve that, they can take your patch and add symbol decoding 
to user-space and test it all.

KISS and such.

Linus?

Thanks,

	Ingo