lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 5 May 2022 18:06:50 -0700 From: Ian Rogers <irogers@...gle.com> To: Adrian Hunter <adrian.hunter@...el.com> Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, Jiri Olsa <jolsa@...hat.com>, Alexey Bayduraev <alexey.v.bayduraev@...ux.intel.com>, Namhyung Kim <namhyung@...nel.org>, Leo Yan <leo.yan@...aro.org>, linux-kernel@...r.kernel.org Subject: Re: [PATCH V1 00/23] perf intel-pt: Better support for perf record --cpu On Thu, May 5, 2022 at 9:56 AM Adrian Hunter <adrian.hunter@...el.com> wrote: > > Hi > > Here are V1 patches to support capturing Intel PT sideband events such as > mmap, task, context switch, text poke etc, on every CPU even when tracing > selected user_requested_cpus. That is, when using the perf record -C or > --cpu option. > > This is needed for: > 1. text poke: a text poke on any CPU affects all CPUs > 2. tracing user space: a user space process can migrate between CPUs so > mmap events that happen on a different CPU can be needed to decode a > user_requested_cpus CPU. > > For example: > > Trace on CPU 1: > > perf record --kcore -C 1 -e intel_pt// & > > Start a task on CPU 0: > > taskset 0x1 testprog & > > Migrate it to CPU 1: > > taskset -p 0x2 <testprog pid> > > Stop tracing: > > kill %1 > > Prior to these changes there will be errors decoding testprog > in userspace because the comm and mmap events for testprog will not > have been captured. > > There is quite a bit of preparation: > > The first patch is a small Intel PT test for system-wide side band. The > test fails before the patches are applied, passed afterwards. > > perf intel-pt: Add a test for system-wide side band [new in V1] > > The next 5 patches stop auxtrace mixing up mmap idx between evlist and > evsel. That is going to matter when > evlist->all_cpus != evlist->user_requested_cpus != evsel->cpus: > > libperf evsel: Factor out perf_evsel__ioctl() [now applied] > libperf evsel: Add perf_evsel__enable_thread() > perf evlist: Use libperf functions in evlist__enable_event_idx() > perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c > perf auxtrace: Do not mix up mmap idx > > The next 6 patches stop attempts to auxtrace mmap when it is not an > auxtrace event e.g. when mmapping the CPUs on which only sideband is > captured: > > libperf evlist: Remove ->idx() per_cpu parameter > libperf evlist: Move ->idx() into mmap_per_evsel() > libperf evlist: Add evsel as a parameter to ->idx() > perf auxtrace: Record whether an auxtrace mmap is needed > perf auxctrace: Add mmap_needed to auxtrace_mmap_params > perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter > > The next 5 patches switch to setting up dummy event maps before adding the > evsel so that the evsel is subject to map propagation, primarily to cause > addition of the evsel's CPUs to all_cpus. > > perf evlist: Factor out evlist__dummy_event() > perf evlist: Add evlist__add_system_wide_dummy() > perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke() > perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking > perf intel-pt: Track sideband system-wide when needed > > The remaining patches make more significant changes. > > First change from using user_requested_cpus to using all_cpus where necessary: > > perf tools: Allow all_cpus to be a superset of user_requested_cpus > > Secondly, mmap all per-thread and all per-cpu events: > > libperf evlist: Allow mixing per-thread and per-cpu mmaps > libperf evlist: Check nr_mmaps is correct [new in V1] > > Stop using system_wide flag for uncore because it will not work anymore: > > perf stat: Add requires_cpu flag for uncore > libperf evsel: Add comments for booleans [new in V1] > > Finally change map propagation so that system-wide events retain their cpus and > (dummy) threads: > > perf tools: Allow system-wide events to keep their own CPUs > perf tools: Allow system-wide events to keep their own threads > > > Changes in V1: > > perf intel-pt: Add a test for system-wide side band > New patch > > libperf evsel: Factor out perf_evsel__ioctl() > Dropped because it has been applied. > > libperf evsel: Add perf_evsel__enable_thread() > Rename variable i -> idx > > perf auxtrace: Do not mix up mmap idx > Rename variable cpu to cpu_map_idx > > perf tools: Allow all_cpus to be a superset of user_requested_cpus > Add Acked-by: Ian Rogers <irogers@...gle.com> > > libperf evlist: Allow mixing per-thread and per-cpu mmaps > Fix perf_evlist__nr_mmaps() calculation > > libperf evlist: Check nr_mmaps is correct > New patch > > libperf evsel: Add comments for booleans > New patch > > perf tools: Allow system-wide events to keep their own CPUs > perf tools: Allow system-wide events to keep their own threads > > > Adrian Hunter (23): > perf intel-pt: Add a test for system-wide side band > libperf evsel: Add perf_evsel__enable_thread() > perf evlist: Use libperf functions in evlist__enable_event_idx() > perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c > perf auxtrace: Do not mix up mmap idx > libperf evlist: Remove ->idx() per_cpu parameter > libperf evlist: Move ->idx() into mmap_per_evsel() > libperf evlist: Add evsel as a parameter to ->idx() > perf auxtrace: Record whether an auxtrace mmap is needed > perf auxtrace: Add mmap_needed to auxtrace_mmap_params > perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter > perf evlist: Factor out evlist__dummy_event() > perf evlist: Add evlist__add_system_wide_dummy() > perf record: Use evlist__add_system_wide_dummy() in record__config_text_poke() > perf intel-pt: Use evlist__add_system_wide_dummy() for switch tracking > perf intel-pt: Track sideband system-wide when needed > perf tools: Allow all_cpus to be a superset of user_requested_cpus > libperf evlist: Allow mixing per-thread and per-cpu mmaps > libperf evlist: Check nr_mmaps is correct > perf stat: Add requires_cpu flag for uncore > libperf evsel: Add comments for booleans > perf tools: Allow system-wide events to keep their own CPUs > perf tools: Allow system-wide events to keep their own threads > > tools/lib/perf/evlist.c | 80 ++++++++++------------- > tools/lib/perf/evsel.c | 10 +++ > tools/lib/perf/include/internal/evlist.h | 3 +- > tools/lib/perf/include/internal/evsel.h | 9 +++ > tools/lib/perf/include/internal/mmap.h | 5 ++ > tools/lib/perf/include/perf/evsel.h | 1 + > tools/perf/arch/arm/util/cs-etm.c | 1 + > tools/perf/arch/arm64/util/arm-spe.c | 1 + > tools/perf/arch/s390/util/auxtrace.c | 1 + > tools/perf/arch/x86/util/intel-bts.c | 1 + > tools/perf/arch/x86/util/intel-pt.c | 32 ++++------ > tools/perf/builtin-record.c | 39 +++++------- > tools/perf/builtin-stat.c | 5 +- > tools/perf/tests/shell/test_intel_pt.sh | 71 +++++++++++++++++++++ > tools/perf/util/auxtrace.c | 31 +++++++-- > tools/perf/util/auxtrace.h | 8 ++- > tools/perf/util/evlist.c | 106 +++++++++++++++---------------- > tools/perf/util/evlist.h | 7 +- > tools/perf/util/evsel.c | 1 + > tools/perf/util/evsel.h | 1 + > tools/perf/util/mmap.c | 4 +- > tools/perf/util/parse-events.c | 2 +- > 22 files changed, 259 insertions(+), 160 deletions(-) > create mode 100755 tools/perf/tests/shell/test_intel_pt.sh > > > Regards > Adrian Thanks Adrian, other than a minor build breakage, and some possible function name and comment tweaks, I'm happy with the change. I did 'perf test' on my SkylakeX and things look good too. One last minor thing is the test_intel_pt.sh doesn't test --per-thread mode, this might be good to cover for CPU map propagation issues. Thanks, Ian
Powered by blists - more mailing lists