linux-kernel - Re: [RFC 4/5] x86, perf: implements lwp-perf-integration (rc1)

Message-ID: <20111220100916.GA20788@elte.hu>
Date:	Tue, 20 Dec 2011 11:09:17 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Avi Kivity <avi@...hat.com>
Cc:	Robert Richter <robert.richter@....com>,
	Benjamin Block <bebl@...eta.org>,
	Hans Rosenfeld <hans.rosenfeld@....com>, hpa@...or.com,
	tglx@...utronix.de, suresh.b.siddha@...el.com, eranian@...gle.com,
	brgerst@...il.com, Andreas.Herrmann3@....com, x86@...nel.org,
	linux-kernel@...r.kernel.org,
	Benjamin Block <benjamin.block@....com>
Subject: Re: [RFC 4/5] x86, perf: implements lwp-perf-integration (rc1)

* Avi Kivity <avi@...hat.com> wrote:

> On 12/20/2011 11:15 AM, Ingo Molnar wrote:
>
> > The LWPCB and the LWP ring-buffer are really just an 
> > extension of that concept: per task buffers which are ring 3 
> > visible.
> 
> No, it's worse.  They are ring 3 writeable, and ring 3 
> configurable.

Avi, i know that very well.

> > Note that user-space does not actually have to know about 
> > any of these LWP addresses (but can access them if it wants 
> > to - no strong feelings about that) - in the correctly 
> > implemented model it's fully kernel managed.
> 
> btw, that means that the intended use case - self-monitoring 
> with no kernel support - cannot be done. [...]

Arguably many years ago the hardware was designed for brain-dead 
instrumentation abstractions.

Note that as i said user-space *can* access the area if it 
thinks it can do it better than the kernel (and we could export 
that information in a well defined way - we could do the same 
for PEBS as well) - i have no particular strong feelings about 
allowing that other than i think it's an obviously inferior 
model - *as long* as proper, generic, usable support is added.

From my perspective there's really just one realistic option to 
accept this feature: if it's properly fit into existing, modern 
instrumentation abstractions. I made that abundantly clear in my 
feedback so far.

It can obviously be done, along the lines of the suggestions i've given.

That was the condition for Intel PEBS/DS/BTS support as well - 
which is hardware that has at least as many brain-dead 
constraints and roadblocks as LWP.

> > > You could rebuild the LWP block on every context switch I 
> > > guess, but you need to prevent access to other cpus' LWP 
> > > blocks (since they may be running other processes).  I 
> > > think this calls for per-cpu cr3, even for threads in the 
> > > same process.
> >
> > Why would we want to rebuild the LWPCB? Just keep one per 
> > task and do a lightweight switch to it during switch_to() - 
> > like we do it with the PEBS hardware-ring-buffer. It can be 
> > in the same single block of memory with the ring-buffer 
> > itself. (PEBS has similar characteristics)
> 
> If it's in globally visible memory, the user can reprogram the 
> LWP from another thread to thrash ordinary VMAs. [...]
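
[Sketch of the "one LWPCB per task, lightweight switch in switch_to()"
arrangement quoted above, as a plain userspace C model. This is an
illustration under stated assumptions, not kernel code: struct lwp_area,
model_switch_to() and model_record_sample() are hypothetical names, and
the real LWPCB layout and switch mechanism are not described in this
thread.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-task control block and its ring
 * buffer, kept together in one block of memory. */
struct lwp_area {
    uint32_t head;       /* total samples written so far */
    uint32_t nr_slots;   /* ring capacity, at most 8 in this model */
    uint64_t slots[8];
};

struct task {
    int pid;
    struct lwp_area *lwp;   /* NULL if the task does not use LWP */
};

/* Models the per-CPU "currently loaded control block" pointer. */
static struct lwp_area *cpu_active_lwp;

/* Lightweight switch: only the active pointer changes. Nothing is
 * rebuilt, and prev's area keeps its contents for the next time it runs. */
static void model_switch_to(struct task *prev, struct task *next)
{
    (void)prev;
    cpu_active_lwp = next->lwp;
}

/* Models the hardware writing one sample into whichever area is active.
 * Returns 0 on success, -1 if no area is loaded. */
static int model_record_sample(uint64_t value)
{
    struct lwp_area *a = cpu_active_lwp;

    if (a == NULL)
        return -1;
    assert(a->nr_slots > 0 && a->nr_slots <= 8);
    a->slots[a->head % a->nr_slots] = value;
    a->head++;
    return 0;
}
```

[In this model each task's samples land only in its own area, and a task
without an area records nothing. The model says nothing about protection:
the area is ordinary writable memory, which is the point the quoted reply
raises next.]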

User-space can smash it and make it not profile or profile the 
wrong thing or into the wrong buffer - but LWP itself runs with 
ring3 privileges so it won't do anything the user couldn't do 
already.

Lack of protection against self-misconfiguration-damage is a 
benign hardware mis-feature - something for LWP v2 to specify i 
guess.

But i don't want to reject this feature based on this 
mis-feature alone - it's a pretty harmless limitation and the 
precise, skid-less profiling that LWP offers is obviously 
useful.

> [...]  It has to be process local (at which point, you can 
> just use do_mmap() to allocate it).

get_unmapped_area() + install_special_mapping() is probably 
better, but yeah.
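
[A rough userspace analogue of a process-local block that holds the
control header and the ring buffer in one mapping. It uses POSIX mmap()
as a stand-in only; the in-kernel route named above (get_unmapped_area()
plus install_special_mapping()) is not shown, since those calls cannot
run outside the kernel. struct lwp_block, lwp_block_alloc() and
lwp_block_free() are hypothetical names, and the one-page size is an
assumption.]

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Header and ring buffer share one mapping, header first. */
struct lwp_block {
    uint32_t head;
    uint32_t nr_slots;
    uint64_t slots[];     /* ring buffer fills the rest of the mapping */
};

/* Map one private, anonymous page and lay the block out in it.
 * Returns NULL on failure; on success *len_out is the mapping length. */
static struct lwp_block *lwp_block_alloc(size_t *len_out)
{
    long page = sysconf(_SC_PAGESIZE);
    struct lwp_block *b;
    size_t len;
    void *p;

    assert(len_out != NULL);
    if (page <= 0)
        return NULL;
    len = (size_t)page;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    b = p;
    b->head = 0;
    b->nr_slots = (uint32_t)((len - sizeof(*b)) / sizeof(b->slots[0]));
    *len_out = len;
    return b;
}

/* Unmap a block obtained from lwp_block_alloc(). */
static void lwp_block_free(struct lwp_block *b, size_t len)
{
    munmap(b, len);
}
```

[The mapping is private to the calling process but writable by every
thread in it, which matches the "process local" requirement and also the
self-misconfiguration limitation discussed earlier.]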

Thanks,

	Ingo