Message-ID: <d7d6cbc8-1f5f-d0f4-6656-f0cc8ab3a118@intel.com>
Date: Tue, 18 Apr 2023 16:36:11 +0300
From: Adrian Hunter <adrian.hunter@...el.com>
To: Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 0/5] perf: Add ioctl to emit sideband events

On 18/04/23 09:18, Adrian Hunter wrote:
> On 17/04/23 14:02, Peter Zijlstra wrote:
>> On Fri, Apr 14, 2023 at 11:22:55AM +0300, Adrian Hunter wrote:
>>> Hi
>>>
>>> Here is a stab at adding an ioctl for sideband events.
>>>
>>> This is to overcome races when reading the same information
>>> from /proc.
>>
>> What races? Are you talking about reading old state in /proc, the
>> kernel delivering a sideband event for the new state, and then you
>> writing the old state out?
>>
>> Surely that's something perf tool can fix without kernel changes?
>
> Yes, and it was a bit of a brain fart not to realise that.
>
> There may still be corner cases where different kinds of events are
> interdependent; for example, NAMESPACES events vs MMAP events could
> have ordering issues.
>
> Putting that aside, the ioctl may be quicker than reading from
> /proc. I could get some numbers and see what people think.
>
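For reference, the tool-side ordering fix amounts to enabling the
sideband events first and only then snapshotting /proc, so anything
that changes during the snapshot also produces a sideband record with
a later timestamp. Roughly like this (a sketch only, not how perf
record is actually structured; single CPU, no error handling):

	#include <linux/perf_event.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>
	#include <string.h>
	#include <unistd.h>

	static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
				   int cpu, int group_fd, unsigned long flags)
	{
		return syscall(__NR_perf_event_open, attr, pid, cpu,
			       group_fd, flags);
	}

	int main(void)
	{
		long page_size = sysconf(_SC_PAGESIZE);
		struct perf_event_attr attr;
		void *ring;
		int fd;

		memset(&attr, 0, sizeof(attr));
		attr.size          = sizeof(attr);
		attr.type          = PERF_TYPE_SOFTWARE;
		attr.config        = PERF_COUNT_SW_DUMMY; /* sideband only */
		attr.comm          = 1;                   /* PERF_RECORD_COMM */
		attr.mmap          = 1;                   /* PERF_RECORD_MMAP */
		attr.task          = 1;                   /* fork / exit */
		attr.sample_id_all = 1;                   /* timestamp sideband */
		attr.sample_type   = PERF_SAMPLE_TIME;
		attr.disabled      = 1;

		/* System-wide needs one event per CPU; CPU 0 shown */
		fd = perf_event_open(&attr, -1, 0, -1, PERF_FLAG_FD_CLOEXEC);

		/* Map the ring buffer so sideband records are collected */
		ring = mmap(NULL, (1 + 8) * page_size, PROT_READ | PROT_WRITE,
			    MAP_SHARED, fd, 0);

		ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

		/*
		 * Only now walk /proc and synthesize records.  Any state
		 * that changed in the meantime is also in the ring buffer
		 * with a later timestamp, so the stale synthesized record
		 * loses when the tool sorts events by time.
		 */

		(void)ring;
		return 0;
	}
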
Here's a result with a quick hack that uses the ioctl but does not
handle the buffer becoming full (hence the -m4M):

# ps -e | wc -l
1171
# perf.old stat -- perf.old record -o old.data --namespaces -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.095 MB old.data (100 samples) ]

 Performance counter stats for 'perf.old record -o old.data --namespaces -a true':

            498.15 msec task-clock                #    0.987 CPUs utilized
               126      context-switches          #  252.935 /sec
                64      cpu-migrations            #  128.475 /sec
             4,396      page-faults               #    8.825 K/sec
     1,927,096,347      cycles                    #    3.868 GHz
     4,563,059,399      instructions              #    2.37  insn per cycle
       914,232,559      branches                  #    1.835 G/sec
         6,618,052      branch-misses             #    0.72% of all branches
     9,633,787,105      slots                     #   19.339 G/sec
     4,394,300,990      topdown-retiring          #     38.8% Retiring
     3,693,815,286      topdown-bad-spec          #     32.6% Bad Speculation
     1,692,356,927      topdown-fe-bound          #     14.9% Frontend Bound
     1,544,151,518      topdown-be-bound          #     13.6% Backend Bound

       0.504636742 seconds time elapsed

       0.158237000 seconds user
       0.340625000 seconds sys

# perf.old stat -- perf.new record -o new.data -m4M --namespaces -a true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.095 MB new.data (103 samples) ]

 Performance counter stats for 'perf.new record -o new.data -m4M --namespaces -a true':

            386.61 msec task-clock                #    0.988 CPUs utilized
               100      context-switches          #  258.658 /sec
                65      cpu-migrations            #  168.128 /sec
             4,935      page-faults               #   12.765 K/sec
     1,495,905,137      cycles                    #    3.869 GHz
     3,647,660,473      instructions              #    2.44  insn per cycle
       735,822,370      branches                  #    1.903 G/sec
         5,765,668      branch-misses             #    0.78% of all branches
     7,477,722,620      slots                     #   19.342 G/sec
     3,415,835,954      topdown-retiring          #     39.5% Retiring
     2,748,625,759      topdown-bad-spec          #     31.8% Bad Speculation
     1,221,594,670      topdown-fe-bound          #     14.1% Frontend Bound
     1,256,150,733      topdown-be-bound          #     14.5% Backend Bound

       0.391472763 seconds time elapsed

       0.141207000 seconds user
       0.246277000 seconds sys

# ls -lh old.data
-rw------- 1 root root 1.2M Apr 18 13:19 old.data
# ls -lh new.data
-rw------- 1 root root 1.2M Apr 18 13:19 new.data
#
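
So, going by the numbers above, the ioctl version cuts elapsed time
from about 505 to 391 msec (roughly 22% less), with most of the saving
in sys time (341 -> 246 msec), for the same amount of data written.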