[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z7WKtI6lwvAZPb1y@google.com>
Date: Tue, 18 Feb 2025 23:39:32 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>,
James Clark <james.clark@...aro.org>,
Weilin Wang <weilin.wang@...el.com>,
Jean-Philippe Romain <jean-philippe.romain@...s.st.com>,
Junhao He <hejunhao3@...wei.com>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v1] perf drm_pmu: Add a tool like PMU to expose DRM
information
Hi Ian,
On Mon, Feb 10, 2025 at 11:17:27PM -0800, Ian Rogers wrote:
> DRM clients expose information through usage stats as documented in
> Documentation/gpu/drm-usage-stats.rst (available online at
> https://docs.kernel.org/gpu/drm-usage-stats.html). Add a tool like
> PMU, similar to the hwmon PMU, that exposes DRM information.
Probably better to put this link in a file header comment.
> For example on a tigerlake laptop:
> ```
> $ perf list drm
>
> List of pre-defined events (to be used in -e or -M):
>
> drm:
> drm-active-stolen-system0
> [Total memory active in one or more engines. Unit: drm_i915]
> drm-active-system0
> [Total memory active in one or more engines. Unit: drm_i915]
> drm-engine-capacity-video
> [Engine capacity. Unit: drm_i915]
> drm-engine-copy
> [Utilization in ns. Unit: drm_i915]
> drm-engine-render
> [Utilization in ns. Unit: drm_i915]
> drm-engine-video
> [Utilization in ns. Unit: drm_i915]
> drm-engine-video-enhance
> [Utilization in ns. Unit: drm_i915]
> drm-purgeable-stolen-system0
> [Size of resident and purgeable memory bufers. Unit: drm_i915]
> drm-purgeable-system0
> [Size of resident and purgeable memory bufers. Unit: drm_i915]
> drm-resident-stolen-system0
> [Size of resident memory bufers. Unit: drm_i915]
> drm-resident-system0
> [Size of resident memory bufers. Unit: drm_i915]
> drm-shared-stolen-system0
> [Size of shared memory bufers. Unit: drm_i915]
> drm-shared-system0
> [Size of shared memory bufers. Unit: drm_i915]
> drm-total-stolen-system0
> [Size of shared and private memory. Unit: drm_i915]
> drm-total-system0
> [Size of shared and private memory. Unit: drm_i915]
> ```
>
> System wide data can be gathered:
> ```
> $ perf stat -x, -I 1000 -e drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0
> 1.000904910,0,bytes,drm-active-stolen-system0,1,100.00,,
> 1.000904910,0,bytes,drm-active-system0,1,100.00,,
> 1.000904910,36,capacity,drm-engine-capacity-video,1,100.00,,
> 1.000904910,0,ns,drm-engine-copy,1,100.00,,
> 1.000904910,1472970566175,ns,drm-engine-render,1,100.00,,
> 1.000904910,0,ns,drm-engine-video,1,100.00,,
> 1.000904910,0,ns,drm-engine-video-enhance,1,100.00,,
> 1.000904910,0,bytes,drm-purgeable-stolen-system0,1,100.00,,
> 1.000904910,38199296,bytes,drm-purgeable-system0,1,100.00,,
> 1.000904910,0,bytes,drm-resident-stolen-system0,1,100.00,,
> 1.000904910,4643196928,bytes,drm-resident-system0,1,100.00,,
> 1.000904910,0,bytes,drm-shared-stolen-system0,1,100.00,,
> 1.000904910,1886871552,bytes,drm-shared-system0,1,100.00,,
> 1.000904910,0,bytes,drm-total-stolen-system0,1,100.00,,
> 1.000904910,4643196928,bytes,drm-total-system0,1,100.00,,
> 2.264426839,0,bytes,drm-active-stolen-system0,1,100.00,,
> ```
>
> Or for a particular process:
> ```
> $ perf stat -x, -I 1000 -e drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0 -p 200027
> 1.001040274,0,bytes,drm-active-stolen-system0,6,100.00,,
> 1.001040274,0,bytes,drm-active-system0,6,100.00,,
> 1.001040274,12,capacity,drm-engine-capacity-video,6,100.00,,
> 1.001040274,0,ns,drm-engine-copy,6,100.00,,
> 1.001040274,1542300,ns,drm-engine-render,6,100.00,,
> 1.001040274,0,ns,drm-engine-video,6,100.00,,
> 1.001040274,0,ns,drm-engine-video-enhance,6,100.00,,
> 1.001040274,0,bytes,drm-purgeable-stolen-system0,6,100.00,,
> 1.001040274,13516800,bytes,drm-purgeable-system0,6,100.00,,
> 1.001040274,0,bytes,drm-resident-stolen-system0,6,100.00,,
> 1.001040274,27746304,bytes,drm-resident-system0,6,100.00,,
> 1.001040274,0,bytes,drm-shared-stolen-system0,6,100.00,,
> 1.001040274,0,bytes,drm-shared-system0,6,100.00,,
> 1.001040274,0,bytes,drm-total-stolen-system0,6,100.00,,
> 1.001040274,27746304,bytes,drm-total-system0,6,100.00,,
> 2.016629075,0,bytes,drm-active-stolen-system0,6,100.00,,
> ```
I've tested it briefly.
$ ./perf stat -e drm-engine-render,drm-total-system0 -a sleep 1
Performance counter stats for 'system wide':
2,869,492,628,815 ns drm-engine-render
2,777,497,600 bytes drm-total-system0
1.004182447 seconds time elapsed
It seems the numbers are quite big.
$ ./perf stat -e drm-engine-render,drm-total-system0 -aA sleep 1
Performance counter stats for 'system wide':
CPU0 2,870,871,280,238 ns drm_i915/drm-engine-render/
CPU1 <not counted> ns drm_i915/drm-engine-render/
CPU2 <not counted> ns drm_i915/drm-engine-render/
CPU3 <not counted> ns drm_i915/drm-engine-render/
CPU0 2,750,578,688 bytes drm_i915/drm-total-system0/
CPU1 <not counted> bytes drm_i915/drm-total-system0/
CPU2 <not counted> bytes drm_i915/drm-total-system0/
CPU3 <not counted> bytes drm_i915/drm-total-system0/
1.001678363 seconds time elapsed
Ok, it only reads from the CPU0. But I guess there are some
duplications. Have you checked drm-client-id?
$ ./perf stat -e drm-engine-render -a --per-thread sleep 1
Performance counter stats for 'system wide':
Xorg-6900 852,545,872,646 ns drm-engine-render
Xorg:disk$0-6901 852,545,872,646 ns drm-engine-render
Xorg:sh0-6902 852,545,872,646 ns drm-engine-render
Xorg:traceq0-6904 852,545,872,646 ns drm-engine-render
Xorg:gdrv0-6906 852,545,872,646 ns drm-engine-render
InputThread-6946 852,545,872,646 ns drm-engine-render
gnome-shell-7127 808,521,145,191 ns drm-engine-render
pool-spawner-7146 808,521,145,191 ns drm-engine-render
gmain-7147 808,521,145,191 ns drm-engine-render
gdbus-7149 808,521,145,191 ns drm-engine-render
dconf worker-7150 808,521,145,191 ns drm-engine-render
JS Helper-7151 808,521,145,191 ns drm-engine-render
JS Helper-7152 808,521,145,191 ns drm-engine-render
...
Trying record..
$ ./perf record -e drm-engine-render sleep 1
failed to mmap with 9 (Bad file descriptor)
I think you can fail evsel__open() if attr.sample_period != 0.
>
> As with the hwmon PMU, high numbered PMU types are used to encode
> multiple possible "DRM" PMUs. The appropriate fdinfo is found by
> scanning /proc and filtering which fdinfos to read with stat. To avoid
> some unneeding scanning, events not starting with "drm-" are ignored.
It's sad that it should scan /proc whenever it reads the event but I
don't think we have other options.
> The patch builds on commit 57e13264dcea ("perf pmus:
> Restructure pmu_read_sysfs to scan fewer PMUs") so that only if full
> wild carding is being done, that the drm PMUs will be read.
Can you please add a test case?
Thanks,
Namhyung
Powered by blists - more mailing lists