Message-ID: <CAMou1-12ggxAuaPQ8VG7Gf5BZM6CVj6773HRB1QJXagB-okuGg@mail.gmail.com>
Date:	Mon, 3 Nov 2014 21:47:17 +0000
From:	Robert Bragg <robert@...bynine.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, Paul Mackerras <paulus@...ba.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Daniel Vetter <daniel.vetter@...ll.ch>,
	Chris Wilson <chris@...is-wilson.co.uk>,
	Rob Clark <robdclark@...il.com>,
	Samuel Pitoiset <samuel.pitoiset@...il.com>,
	Ben Skeggs <bskeggs@...hat.com>
Subject: Re: [RFC PATCH 0/3] Expose gpu counters via perf pmu driver

On Thu, Oct 30, 2014 at 7:08 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, Oct 22, 2014 at 04:28:48PM +0100, Robert Bragg wrote:
>> Our desired permission model seems consistent with perf's current model
>> whereby you would need privileges if you want to profile across all gpu
>> contexts but not need special permissions to profile your own context.
>>
>> The awkward part is that it doesn't make sense for us to have userspace
>> open a perf event with a specific pid as the way to avoid needing root
>> permissions because a side effect of doing this is that the events will
>> be dynamically added/deleted so as to only monitor while that process is
>> scheduled and that's not really meaningful when we're monitoring the
>> gpu.
>
> There is precedent in PERF_FLAG_PID_CGROUP to replace the pid argument
> with a fd to your object.

Ah ok, interesting.

>
> And do I take it right that if you're able/allowed/etc.. to open/have
> the fd to the GPU/DRM/DRI whatever context you have the right
> credentials to also observe these counters?

Right, and in particular since we want to allow OpenGL clients to be
able to profile their own gpu context without any special privileges,
my current pmu driver accepts a device file descriptor via config1 + a
context id via attr->config, both for checking credentials and
uniquely identifying which context should be profiled. (A single
client can open multiple contexts via one drm fd)
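
As a rough userspace illustration of the above, here is a minimal sketch
of filling in a perf_event_attr for a single gpu context. The helper name
gpu_ctx_event_attr() is hypothetical, and the pmu type id is assumed to
have been read from the pmu's sysfs "type" file beforehand; only the use
of attr->config for the context id and config1 for the drm fd comes from
the description above.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch series: prepare an event
 * that profiles one gpu context owned by the caller.
 */
static void gpu_ctx_event_attr(struct perf_event_attr *attr,
                               uint32_t pmu_type, int drm_fd,
                               uint32_t ctx_id)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = pmu_type;            /* dynamic pmu type id (assumed read from sysfs) */
    attr->config = ctx_id;            /* which gpu context to profile */
    attr->config1 = (uint64_t)drm_fd; /* drm fd used for the credential check */
}
```

The attr would then be handed to the perf_event_open syscall as usual;
the fd only needs to stay open for as long as the driver needs it to
validate the request, which is a detail this sketch does not cover.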

That said, when running as root it is not currently a
requirement to pass any fd when configuring an event to profile across
all gpu contexts. I'm just mentioning this because although I think it
should be ok for us to use an fd to determine credentials and help
specify a gpu context, an fd might not be necessary for system wide
profiling cases.
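
For comparison, a sketch of the system wide case, where no fd is passed.
The function names are hypothetical and leaving config/config1 at zero is
an assumption made for illustration; pid == -1 with an explicit cpu is
the standard perf way of asking for an event that is bound to a cpu
rather than to a process, which normally requires privileges.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Thin wrapper: glibc does not provide a perf_event_open() function. */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                            int cpu, int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical helper: attr for profiling across all gpu contexts. */
void gpu_system_wide_attr(struct perf_event_attr *attr, uint32_t pmu_type)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = pmu_type; /* dynamic pmu type id (assumed read from sysfs) */
    /* config and config1 stay 0: no context id, no drm fd (assumption) */
}

/* Returns an event fd, or -1 with errno set (e.g. when not privileged). */
long gpu_open_system_wide(uint32_t pmu_type, int cpu)
{
    struct perf_event_attr attr;

    gpu_system_wide_attr(&attr, pmu_type);
    /* pid = -1: unspecified process; cpu: the specific cpu to attach to */
    return perf_event_open(&attr, -1, cpu, -1, 0);
}
```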

>
>> Conceptually I suppose we want to be able to open an event that's not
>> associated with any cpu or process, but to keep things simple and fit
>> with perf's current design, the pmu I have a.t.m expects an event to be
>> opened for a specific cpu and unspecified process.
>
> There are no actual scheduling ramifications right? Let me ponder this
> for a little while more..

Ok, I can't say I'm familiar enough with the core perf infrastructure
to be entirely sure about this.

I recall looking at how some of the uncore perf drivers were working
and it looked like they had a similar issue where conceptually the pmu
doesn't belong to a specific cpu and so the id would internally get
mapped to some package state, shared by multiple cpus.

My understanding had been that being associated with a specific cpu
did have the side effect that most of the pmu methods for that event
would then be invoked on that cpu through inter-processor interrupts. At
one point that had seemed slightly problematic because there weren't
many places within my pmu driver where I could assume I was in process
context and could sleep. This was a problem with an earlier version
because the way I read registers had a slim chance of needing to sleep
waiting for the gpu to come out of RC6, but isn't a problem any more.

One thing that does come to mind here though is that I am overloading
pmu->read() as a mechanism for userspace to trigger a flush of all
counter snapshots currently in the gpu circular buffer to userspace as
perf events. Perhaps it would be best if that work (which might be
relatively costly at times) were done in the context of the process
issuing the flush(), instead of under an IPI (assuming that has some
effect on scheduler accounting).

Regards,
- Robert