Message-ID: <1ad8874b-e6fc-9cb2-8dbb-7de6139e6c4a@codeaurora.org>
Date: Tue, 24 Sep 2019 20:00:47 +0530
From: Mukesh Ojha <mojha@...eaurora.org>
To: Jiri Olsa <jolsa@...hat.com>
Cc: linux-kernel@...r.kernel.org,
Raghavendra Rao Ananta <rananta@...eaurora.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Alexei Starovoitov <ast@...nel.org>
Subject: Re: [PATCH V5 1/1] perf: event preserve and create across cpu hotplug
On 8/12/2019 4:12 PM, Jiri Olsa wrote:
> On Fri, Aug 02, 2019 at 12:16:53AM +0530, Mukesh Ojha wrote:
>> The perf framework doesn't allow preserving CPU events across
>> CPU hotplugs. The events are scheduled out as and when the
>> CPU goes offline. Moreover, the framework also doesn't
>> allow clients to create events on an offline CPU. As a
>> result, clients have to keep monitoring the CPU state
>> until it comes back online.
>>
>> Therefore, extend the perf framework to support creating
>> and preserving (CPU) events for offline CPUs. With this,
>> the CPU's online state becomes transparent to the client,
>> which no longer has to monitor the CPU's state. Success is
>> returned to the client even when the event is created on an
>> offline CPU. If the CPU goes offline during the lifetime of
>> the event, the event is preserved and continues to count as
>> soon as (and if) the CPU comes back online.
>>
>> Co-authored-by: Peter Zijlstra <peterz@...radead.org>
>> Signed-off-by: Raghavendra Rao Ananta <rananta@...eaurora.org>
>> Signed-off-by: Mukesh Ojha <mojha@...eaurora.org>
>> Cc: Peter Zijlstra <peterz@...radead.org>
>> Cc: Ingo Molnar <mingo@...hat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@...nel.org>
>> Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
>> Cc: Jiri Olsa <jolsa@...hat.com>
>> Cc: Alexei Starovoitov <ast@...nel.org>
>> ---
>> Change in V5:
>> =============
>> - Rebased it.
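
The behaviour described in the changelog above is visible to user space
through the perf_event_open() syscall with an event pinned to a CPU. A
minimal client sketch, not part of the patch (CPU 3, the cycles event,
and the sleep are arbitrary illustrative choices, and it assumes
sufficient privileges, i.e. perf_event_paranoid permitting):

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* glibc has no wrapper for perf_event_open(), so call it directly. */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	uint64_t count;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;

	/*
	 * pid == -1, cpu == 3: count cycles for all tasks on CPU 3.
	 * Today this open fails while CPU 3 is offline; with the patch
	 * it succeeds and the event starts counting as soon as (and if)
	 * CPU 3 comes back online.
	 */
	fd = perf_event_open(&attr, -1, 3, -1, 0);
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}

	sleep(1);
	if (read(fd, &count, sizeof(count)) == (ssize_t)sizeof(count))
		printf("cycles on CPU 3: %llu\n", (unsigned long long)count);
	close(fd);
	return 0;
}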
> note that we might need to change how we store the cpu topology,
> now that it can change during sampling.. below is a comparison of
> the header data with and without cpu 1
>
> I think some of the report code checks the topology or caches
> and might get confused
>
> perhaps we could watch the cpu topology in record and update the
> data as we see it changing.. future TODO list ;-)
Hi Jiri,

Can we do something like the change below to address the issue of the
header data going stale when CPUs are hotplugged during perf record?
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1432,7 +1432,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		opts->no_bpf_event = true;
 	}
 
-	err = record__synthesize(rec, false);
+	err = record__synthesize(rec, true);
 	if (err < 0)
 		goto out_child;
 
@@ -1652,7 +1652,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 	} else
 		status = err;
 
-	record__synthesize(rec, true);
+	record__synthesize(rec, false);
 
 	/* this will be recalculated during process_buildids() */
 	rec->samples = 0;
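
For reference, record__synthesize() is gated on its second argument;
abridged from my reading of tools/perf/builtin-record.c (the elided
parts do the actual synthesis):

static int record__synthesize(struct record *rec, bool tail)
{
	/*
	 * Run either at the start (tail == false) or at the end
	 * (tail == true) of the session, depending on whether
	 * --tail-synthesize / overwrite mode set tail_synthesize.
	 */
	if (rec->opts.tail_synthesize != tail)
		return 0;
	...
}

So with opts.tail_synthesize left unset (the default), swapping the two
booleans turns the call at the start of __cmd_record() into a no-op and
makes the call at the end do the synthesis, i.e. the topology and header
data would be captured after the workload has run and would reflect any
hotplug that happened in between.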
Thanks.
Mukesh
>
> perf stat is probably fine
>
> jirka
>
>
> ---
> -# nrcpus online : 39
> +# nrcpus online : 40
> # nrcpus avail : 40
> # cpudesc : Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
> # cpuid : GenuineIntel,6,85,4
> ...
> # sibling sockets : 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> -# sibling sockets : 3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> +# sibling sockets : 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> # sibling dies : 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> -# sibling dies : 3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> +# sibling dies : 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> # sibling threads : 0,20
> +# sibling threads : 1,21
> # sibling threads : 2,22
> # sibling threads : 3,23
> # sibling threads : 4,24
> @@ -38,9 +39,8 @@
> # sibling threads : 17,37
> # sibling threads : 18,38
> # sibling threads : 19,39
> -# sibling threads : 21
> # CPU 0: Core ID 0, Die ID 0, Socket ID 0
> -# CPU 1: Core ID -1, Die ID -1, Socket ID -1
> +# CPU 1: Core ID 0, Die ID 0, Socket ID 1
> # CPU 2: Core ID 4, Die ID 0, Socket ID 0
> # CPU 3: Core ID 4, Die ID 0, Socket ID 1
> # CPU 4: Core ID 1, Die ID 0, Socket ID 0
> @@ -79,14 +79,16 @@
> # CPU 37: Core ID 9, Die ID 0, Socket ID 1
> # CPU 38: Core ID 10, Die ID 0, Socket ID 0
> # CPU 39: Core ID 10, Die ID 0, Socket ID 1
> -# node0 meminfo : total = 47391616 kB, free = 46536844 kB
> +# node0 meminfo : total = 47391616 kB, free = 46548348 kB
> # node0 cpu list : 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> -# node1 meminfo : total = 49539612 kB, free = 48908820 kB
> -# node1 cpu list : 3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> +# node1 meminfo : total = 49539612 kB, free = 48897176 kB
> +# node1 cpu list : 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> # pmu mappings: intel_pt = 8, uncore_cha_1 = 25, uncore_irp_3 = 49, software = 1, uncore_imc_5 = 18, uncore_m3upi_0 = 21, uncore_iio_free_running_5 = 45, uncore_irp_1 = 47, uncore_m2m_1 = 12, uncore_imc_3 = 16, uncore_cha_8 = 32, uncore_iio_free_running_3 = 43, uncore_imc_1 = 14, uncore_upi_1 = 20, power = 10, uncore_cha_6 = 30, uncore_iio_free_running_1 = 41, uncore_iio_4 = 38, uprobe = 7, cpu = 4, uncore_cha_4 = 28, uncore_iio_2 = 36, cstate_core = 53, breakpoint = 5, uncore_cha_2 = 26, uncore_irp_4 = 50, uncore_m3upi_1 = 22, uncore_iio_0 = 34, tracepoint = 2, uncore_cha_0 = 24, uncore_irp_2 = 48, cstate_pkg = 54, uncore_imc_4 = 17, uncore_cha_9 = 33, uncore_iio_free_running_4 = 44, uncore_ubox = 23, uncore_irp_0 = 46, uncore_m2m_0 = 11, uncore_imc_2 = 15, kprobe = 6, uncore_cha_7 = 31, uncore_iio_free_running_2 = 42, uncore_iio_5 = 39, uncore_imc_0 = 13, uncore_upi_0 = 19, uncore_cha_5 = 29, uncore_iio_free_running_0 = 40, uncore_pcu = 52, msr = 9, uncore_iio_3 = 37, uncore_cha_3 = 27, uncore_irp_5 = 51, uncore_iio_1 = 35
> # CPU cache info:
> # L1 Data 32K [0,20]
> # L1 Instruction 32K [0,20]
> +# L1 Data 32K [1,21]
> +# L1 Instruction 32K [1,21]
> # L1 Data 32K [2,22]
> # L1 Instruction 32K [2,22]
> # L1 Data 32K [3,23]
> @@ -123,9 +125,8 @@
> # L1 Instruction 32K [18,38]
> # L1 Data 32K [19,39]
> # L1 Instruction 32K [19,39]
> -# L1 Data 32K [21]
> -# L1 Instruction 32K [21]
> # L2 Unified 1024K [0,20]
> +# L2 Unified 1024K [1,21]
> # L2 Unified 1024K [2,22]
> # L2 Unified 1024K [3,23]
> # L2 Unified 1024K [4,24]
> @@ -144,12 +145,11 @@
> # L2 Unified 1024K [17,37]
> # L2 Unified 1024K [18,38]
> # L2 Unified 1024K [19,39]
> -# L2 Unified 1024K [21]
> # L3 Unified 14080K [0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38]
> -# L3 Unified 14080K [3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39]
> ...