lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 7 Sep 2022 07:52:03 +0300
From:   Adrian Hunter <adrian.hunter@...el.com>
To:     Namhyung Kim <namhyung@...nel.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...hat.com>, Ian Rogers <irogers@...gle.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-perf-users <linux-perf-users@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for
 multi-threaded targets

On 6/09/22 20:50, Namhyung Kim wrote:
> On Mon, Sep 5, 2022 at 4:42 AM Adrian Hunter <adrian.hunter@...el.com> wrote:
>>
>> The offending commit removed mmap_per_thread(), which did not consider
>> the different set-output rules for per-thread mmaps i.e. in the per-thread
>> case set-output is used for file descriptors of the same thread not the
>> same cpu.
>>
>> This was not immediately noticed because it only happens with
>> multi-threaded targets and we do not have a test for that yet.
> 
> Yeah, this is unfortunate.  I feel like I need to spend some time on it.
> 
>>
>> Reinstate mmap_per_thread() expanding it to cover also system-wide per-cpu
>> events i.e. to continue to allow the mixing of per-thread and per-cpu
>> mmaps.
>>
>> Debug messages (with -vv) show the file descriptors that are opened with
>> sys_perf_event_open. New debug messages are added (needs -vvv) that show
>> also which file descriptors are mmapped and which are redirected with
>> set-output.
>>
>> In the per-cpu case (cpu != -1) file descriptors for the same CPU are
>> set-output to the first file descriptor for that CPU.
>>
>> In the per-thread case (cpu == -1) file descriptors for the same thread are
>> set-output to the first file descriptor for that thread.
>>
>> Example (process 17489 has 2 threads):
>>
>>  Before (but with new debug prints):
>>
>>    $ perf record --no-bpf-event -vvv --per-thread -p 17489
>>    <SNIP>
>>    sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
>>    sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
>>    <SNIP>
>>    libperf: idx 0: mmapping fd 5
>>    libperf: idx 0: set output fd 6 -> 5
>>    failed to mmap with 22 (Invalid argument)
>>
>>  After:
>>
>>    $ perf record --no-bpf-event -vvv --per-thread -p 17489
>>    <SNIP>
>>    sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
>>    sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
>>    <SNIP>
>>    libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
>>    libperf: idx 0: mmapping fd 5
>>    libperf: idx 1: mmapping fd 6
>>    <SNIP>
>>    [ perf record: Woken up 2 times to write data ]
>>    [ perf record: Captured and wrote 0.018 MB perf.data (15 samples) ]
> 
> It'd be nice if the example had 2 events so that it could check the
> set-output rule actually worked.

Here is an Intel PT example

$ perf record -vvv --per-thread -p 1521 -e intel_pt//
Using CPUID GenuineIntel-6-8C-1
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
DEBUGINFOD_URLS=
nr_cblocks: 0
affinity: SYS
mmap flush: 1
comp level: 0
------------------------------------------------------------
perf_event_attr:
  type                             9
  size                             128
  config                           0x300e601
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|IDENTIFIER
  read_format                      ID
  disabled                         1
  sample_id_all                    1
  exclude_guest                    1
  aux_watermark                    1048576
------------------------------------------------------------
sys_perf_event_open: pid 1521  cpu -1  group_fd -1  flags 0x8 = 5
sys_perf_event_open: pid 1522  cpu -1  group_fd -1  flags 0x8 = 6
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             128
  config                           0x9
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|IDENTIFIER
  read_format                      ID
  disabled                         1
  exclude_kernel                   1
  exclude_hv                       1
  mmap                             1
  comm                             1
  task                             1
  sample_id_all                    1
  exclude_guest                    1
  mmap2                            1
  comm_exec                        1
  bpf_event                        1
------------------------------------------------------------
sys_perf_event_open: pid 1521  cpu -1  group_fd -1  flags 0x8 = 7
sys_perf_event_open: pid 1522  cpu -1  group_fd -1  flags 0x8 = 8
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             128
  config                           0x9
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|IDENTIFIER
  read_format                      ID
  exclude_kernel                   1
  exclude_hv                       1
  sample_id_all                    1
  exclude_guest                    1
  ksymbol                          1
  text_poke                        1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 9
sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 10
sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 11
sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 12
sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 13
sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 14
sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 15
sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 16
mmap size 528384B
AUX area mmap length 4194304
libperf: mmap_per_thread: nr cpu values (may include -1) 9 nr threads 2
libperf: idx 0: mmapping fd 5
libperf: idx 0: set output fd 7 -> 5
libperf: idx 1: mmapping fd 6
libperf: idx 1: set output fd 8 -> 6
libperf: idx 2: mmapping fd 9
libperf: idx 3: mmapping fd 10
libperf: idx 4: mmapping fd 11
libperf: idx 5: mmapping fd 12
libperf: idx 6: mmapping fd 13
libperf: idx 7: mmapping fd 14
libperf: idx 8: mmapping fd 15
libperf: idx 9: mmapping fd 16
Control descriptor is not initialized
thread_data[0x5572a73ec2f0]: nr_mmaps=10, maps=0x5572a73ec380, ow_maps=(nil)
thread_data[0x5572a73ec2f0]: cpu-1: maps[0] -> mmap[0]
thread_data[0x5572a73ec2f0]: cpu0: maps[1] -> mmap[1]
thread_data[0x5572a73ec2f0]: cpu1: maps[2] -> mmap[2]
thread_data[0x5572a73ec2f0]: cpu2: maps[3] -> mmap[3]
thread_data[0x5572a73ec2f0]: cpu3: maps[4] -> mmap[4]
thread_data[0x5572a73ec2f0]: cpu4: maps[5] -> mmap[5]
thread_data[0x5572a73ec2f0]: cpu5: maps[6] -> mmap[6]
thread_data[0x5572a73ec2f0]: cpu6: maps[7] -> mmap[7]
thread_data[0x5572a73ec2f0]: cpu7: maps[8] -> mmap[8]
thread_data[0x5572a73ec2f0]: cpu-1: maps[9] -> mmap[9]
thread_data[0x5572a73ec2f0]: pollfd[0] <- event_fd=5
thread_data[0x5572a73ec2f0]: pollfd[1] <- event_fd=7
thread_data[0x5572a73ec2f0]: pollfd[2] <- event_fd=6
thread_data[0x5572a73ec2f0]: pollfd[3] <- event_fd=8
thread_data[0x5572a73ec2f0]: pollfd[4] <- non_perf_event fd=4
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             128
  config                           0x9
  watermark                        1
  sample_id_all                    1
  bpf_event                        1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 1521  cpu -1  group_fd -1  flags 0x8 = 17
sys_perf_event_open: pid 1522  cpu -1  group_fd -1  flags 0x8 = 18
mmap size 528384B
libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
libperf: idx 0: mmapping fd 17
libperf: idx 1: mmapping fd 18
Synthesizing TSC conversion information
Synthesizing id index
Synthesizing auxtrace information
auxtrace idx 1 old 0 head 0x2c0 diff 0x2c0
auxtrace idx 0 old 0 head 0x2060 diff 0x2060
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.029 MB perf.data ]




> 
> Thanks,
> Namhyung
> 
>>
>> Per-cpu example (process 20341 has 2 threads, same as above):
>>
>>    $ perf record --no-bpf-event -vvv -p 20341
>>    <SNIP>
>>    sys_perf_event_open: pid 20341  cpu 0  group_fd -1  flags 0x8 = 5
>>    sys_perf_event_open: pid 20342  cpu 0  group_fd -1  flags 0x8 = 6
>>    sys_perf_event_open: pid 20341  cpu 1  group_fd -1  flags 0x8 = 7
>>    sys_perf_event_open: pid 20342  cpu 1  group_fd -1  flags 0x8 = 8
>>    sys_perf_event_open: pid 20341  cpu 2  group_fd -1  flags 0x8 = 9
>>    sys_perf_event_open: pid 20342  cpu 2  group_fd -1  flags 0x8 = 10
>>    sys_perf_event_open: pid 20341  cpu 3  group_fd -1  flags 0x8 = 11
>>    sys_perf_event_open: pid 20342  cpu 3  group_fd -1  flags 0x8 = 12
>>    sys_perf_event_open: pid 20341  cpu 4  group_fd -1  flags 0x8 = 13
>>    sys_perf_event_open: pid 20342  cpu 4  group_fd -1  flags 0x8 = 14
>>    sys_perf_event_open: pid 20341  cpu 5  group_fd -1  flags 0x8 = 15
>>    sys_perf_event_open: pid 20342  cpu 5  group_fd -1  flags 0x8 = 16
>>    sys_perf_event_open: pid 20341  cpu 6  group_fd -1  flags 0x8 = 17
>>    sys_perf_event_open: pid 20342  cpu 6  group_fd -1  flags 0x8 = 18
>>    sys_perf_event_open: pid 20341  cpu 7  group_fd -1  flags 0x8 = 19
>>    sys_perf_event_open: pid 20342  cpu 7  group_fd -1  flags 0x8 = 20
>>    <SNIP>
>>    libperf: mmap_per_cpu: nr cpu values 8 nr threads 2
>>    libperf: idx 0: mmapping fd 5
>>    libperf: idx 0: set output fd 6 -> 5
>>    libperf: idx 1: mmapping fd 7
>>    libperf: idx 1: set output fd 8 -> 7
>>    libperf: idx 2: mmapping fd 9
>>    libperf: idx 2: set output fd 10 -> 9
>>    libperf: idx 3: mmapping fd 11
>>    libperf: idx 3: set output fd 12 -> 11
>>    libperf: idx 4: mmapping fd 13
>>    libperf: idx 4: set output fd 14 -> 13
>>    libperf: idx 5: mmapping fd 15
>>    libperf: idx 5: set output fd 16 -> 15
>>    libperf: idx 6: mmapping fd 17
>>    libperf: idx 6: set output fd 18 -> 17
>>    libperf: idx 7: mmapping fd 19
>>    libperf: idx 7: set output fd 20 -> 19
>>    <SNIP>
>>    [ perf record: Woken up 7 times to write data ]
>>    [ perf record: Captured and wrote 0.020 MB perf.data (17 samples) ]
>>
>> Fixes: ae4f8ae16a07 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
>> Signed-off-by: Adrian Hunter <adrian.hunter@...el.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ