Date:   Thu, 15 Oct 2020 13:35:27 +0300
From:   Alexey Budankov <alexey.budankov@...ux.intel.com>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1 00/15] Introduce threaded trace streaming for basic
 perf record operation


On 14.10.2020 20:27, Ingo Molnar wrote:
> 
> * Alexey Budankov <alexey.budankov@...ux.intel.com> wrote:
> 
>>
>> Patch set provides threaded trace streaming for base perf record
>> operation. Provided streaming mode (--threads) mitigates profiling
>> data losses and resolves scalability issues of serial and asynchronous
>> (--aio) trace streaming modes on multicore server systems. The patch
>> set is based on the prototype [1], [2] and the most closely relates
>> to mode 3) "mode that creates thread for every monitored memory map".
>>
>> The threaded mode executes one-to-one mapping of trace streaming threads
>> to mapped data buffers and streaming into per-CPU trace files located
>> at data directory. The data buffers and threads are affined to NUMA
>> nodes and monitored CPUs according to system topology. --cpu option
>> can be used to specify exact CPUs to be monitored.
> 
> Yay! This should really be the default trace capture model everywhere 
> possible.
> 
> Can we do this for perf top too? It's really struggling with lots of cores.
> 
> If on a 64-core system I run just a moderately higher frequency 'perf top' 
> of 1 kHz:
> 
>   perf top -e cycles -F 1000
> 
> perf stays stuck forever in 'Collecting samples...', and I also get a lot 
> of:
> 
>   [548112.871089] Uhhuh. NMI received for unknown reason 31 on CPU 25.
>   [548112.871089] Do you have a strange power saving mode enabled?

Yes, we can. I would prefer to do it in a separate patch set, though, since
this patch set is already complex enough as a single change.
Is that OK?

I would also appreciate it if you could clarify or advise on the expected
impact of this perf top improvement, or maybe even provide some feedback on
adoption of this feature, to help me better justify the effort to my management.

Gratefully,
Alexei

> 
> Thanks,
> 
> 	Ingo
> 
