Date:	Tue, 20 Dec 2011 11:47:02 +0200
From:	Avi Kivity <avi@...hat.com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Robert Richter <robert.richter@....com>,
	Benjamin Block <bebl@...eta.org>,
	Hans Rosenfeld <hans.rosenfeld@....com>, hpa@...or.com,
	tglx@...utronix.de, suresh.b.siddha@...el.com, eranian@...gle.com,
	brgerst@...il.com, Andreas.Herrmann3@....com, x86@...nel.org,
	linux-kernel@...r.kernel.org,
	Benjamin Block <benjamin.block@....com>
Subject: Re: [RFC 4/5] x86, perf: implements lwp-perf-integration (rc1)

On 12/20/2011 11:15 AM, Ingo Molnar wrote:
> The LWPCB and the LWP ring-buffer are really just an extension 
> of that concept: per task buffers which are ring 3 visible. 

No, it's worse.  They are ring 3 writeable, and ring 3 configurable.

> Note that user-space does not actually have to know about any of 
> these LWP addresses (but can access them if it wants to - no 
> strong feelings about that) - in the correctly implemented model 
> it's fully kernel managed.

Btw, that means that the intended use case - self-monitoring with no
kernel support - cannot be done.  That's not an issue per se; it depends
on the cost of the kernel support and whether any information is lost
(like the records inserted by the explicit LWP instructions).

> In fact the PEBS case had one more complication: there's the BTS 
> branch-tracing feature which we support as well, and which 
> overlaps PEBS use of the DS.

(semi-related: neither DS nor LWP can be used by kvm to monitor a guest
from the host, since both use virtual addresses)

> All these PMU hardware limitations can be supported, as long as 
> the instrumentation *capability* adds value to the system in one 
> way or another.
>
> > >    System-wide profiling is a small additional variant of 
> > >    this: creating such a user-vmalloc() area for all tasks 
> > >    in the system so that the PMU code has them ready in the 
> > >    context-switch code.
> > 
> > What about security?  Do we want to allow any userspace 
> > process to mess up the buffers?  It can even reprogram the LWP 
> > block, so you're counting different things, or at higher 
> > frequencies, or into other processes' ordinary vmas?
>
> In most usecases it's the application messing up its own 
> profiling - don't do that if it hurts.

Not in the system profiling case (not that anything truly bad will
happen, but it's not nice to have the kernel supplying data it can't trust).
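
To make the trust problem concrete, here is a minimal sketch of how a
consumer might sanity-check ring offsets that live in user-writable memory.
Everything in it is an assumption for illustration: struct ring_view,
offsets_valid() and records_pending() are hypothetical names, and the layout
is not the real LWPCB or LWP record format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a ring buffer; not the real LWP layout. */
struct ring_view {
    uint32_t size;        /* ring size in bytes, fixed by the consumer at setup */
    uint32_t record_size; /* bytes per record, fixed at setup */
};

/*
 * head and tail are byte offsets read from memory that ring 3 can write,
 * so treat them as hostile input: both must be in range and record-aligned.
 */
static bool offsets_valid(const struct ring_view *rv, uint32_t head, uint32_t tail)
{
    assert(rv != NULL);
    if (rv->record_size == 0 || rv->size % rv->record_size != 0)
        return false;
    if (head >= rv->size || tail >= rv->size)
        return false;
    return head % rv->record_size == 0 && tail % rv->record_size == 0;
}

/* Number of complete records between tail (consumer) and head (producer). */
static uint32_t records_pending(const struct ring_view *rv, uint32_t head, uint32_t tail)
{
    uint32_t bytes;

    if (!offsets_valid(rv, head, tail))
        return 0; /* drop everything rather than follow bad offsets */
    bytes = head >= tail ? head - tail : rv->size - tail + head;
    return bytes / rv->record_size;
}
```

Checks like these only keep the consumer from reading out of bounds.  They
do nothing about the record contents, which ring 3 can still rewrite, and
that is the part that cannot be trusted.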

> I'd argue that future LWP versions should allow kernel-protected 
> LWP pages, as long as the LWPCB is privileged as well. 
> That would be useful for another purpose as well: LWP could be 
> allowed to sample kernel-space execution as well, an obviously 
> useful feature that was left out from LWP for barely explicable 
> reasons.
>
> Granted, LWP was mis-designed to quite a degree, those AMD chip 
> engineers should have talked to people who understand how modern 
> PMU abstractions are added to the OS kernel properly. But this 
> mis-design does not keep us from utilizing this piece of 
> hardware intelligently. PEBS/DS/BTS wasn't a beauty either.

LWP was clearly designed for userspace JITs, and clearly designed to
work with minimal kernel support.  For this use case, it wasn't
mis-designed.  Maybe they designed for the wrong requirements and
constraints (for example, it is much harder to get PMU abstractions into
Windows than into Linux), but within those requirements, it appears to
be well done.

I'm worried that shoe-horning LWP into the system profiling role will
result in poor support for that role, *and* prevent its use in the
intended use case.

> > You could rebuild the LWP block on every context switch I 
> > guess, but you need to prevent access to other cpus' LWP 
> > blocks (since they may be running other processes).  I think 
> > this calls for per-cpu cr3, even for threads in the same 
> > process.
>
> Why would we want to rebuild the LWPCB? Just keep one per task 
> and do a lightweight switch to it during switch_to() - like we 
> do it with the PEBS hardware-ring-buffer. It can be in the same 
> single block of memory with the ring-buffer itself. (PEBS has 
> similar characteristics)

If it's in globally visible memory, the user can reprogram the LWP from
another thread to trash ordinary VMAs.  It has to be process-local (at
which point, you can just use do_mmap() to allocate it).
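
As a rough userspace analogue of that allocation (not kernel code, and not
a do_mmap() call), the sketch below puts a control block and its ring
buffer into one private anonymous mapping.  The names struct fake_lwpcb,
struct lwp_area, lwp_area_alloc() and lwp_area_free() are made up for the
example, and the control block fields are placeholders rather than the real
LWPCB layout.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Placeholder control block; the real LWPCB layout is different. */
struct fake_lwpcb {
    uint32_t buffer_size; /* ring size in bytes */
    uint32_t head_offset; /* producer offset into the ring */
    uint32_t tail_offset; /* consumer offset into the ring */
};

struct lwp_area {
    void *base;            /* start of the mapping */
    size_t length;         /* total mapped bytes */
    struct fake_lwpcb *cb; /* control block, first page */
    unsigned char *ring;   /* ring buffer, following pages */
};

/* Map one block: a page for the control block, then the ring. */
static int lwp_area_alloc(struct lwp_area *area, size_t ring_bytes)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t pg, ring_len, total;
    void *p;

    if (page <= 0 || ring_bytes == 0 || ring_bytes > ((size_t)1 << 30))
        return -1;
    pg = (size_t)page;
    ring_len = (ring_bytes + pg - 1) / pg * pg; /* round up to whole pages */
    total = pg + ring_len;

    /* Private and anonymous: mapped only in this process's address space. */
    p = mmap(NULL, total, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    area->base = p;
    area->length = total;
    area->cb = p;
    area->ring = (unsigned char *)p + pg;
    area->cb->buffer_size = (uint32_t)ring_len;
    area->cb->head_offset = 0;
    area->cb->tail_offset = 0;
    return 0;
}

static void lwp_area_free(struct lwp_area *area)
{
    munmap(area->base, area->length);
}
```

The mapping is still writable by every thread of the owning process; what
it buys is that no other process has it mapped.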

-- 
error compiling committee.c: too many arguments to function
