lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 22 Jan 2019 09:59:49 +0100 From: Stefan Agner <stefan@...er.ch> To: Mark Rutland <mark.rutland@....com> Cc: will.deacon@....com, shawnguo@...nel.org, l.stach@...gutronix.de, kernel@...gutronix.de, fabio.estevam@....com, linux-imx@....com, linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org Subject: Re: [PATCH] drivers/perf: handle multiple CPUs with single interrupts On 21.01.2019 18:23, Mark Rutland wrote: > Hi Stefan, > > On Mon, Jan 21, 2019 at 03:41:11PM +0100, Stefan Agner wrote: >> Currently, if only a single interrupt is available, the code assigns >> this single interrupt to the first CPU. All other CPUs are left >> unsupported. This allows to use perf events only on processes using >> the first CPU. This is not obvious to the user. >> >> Instead, disable interrupts but support all CPUs. This allows to use >> the PMU on all CPUs for all events other than sampling events which do >> require interrupt support. > > Even for non-sampling events we use the overflow interrupt to ensure > that we don't lose count across overflows, and without that interrupt, > we cannot guarantee that the results are accurate. > > For example: > > Program counter to 0 > Start program > < counter overflows, no interrupt > > < counter overflows, no interrupt > > Stop program > Counter reads as 5 > > > ... which we cannot distinguish from: > > Program counter to 0 > Start program > < counter overflows, no interrupt > > < counter overflows, no interrupt > > < counter overflows, no interrupt > > < counter overflows, no interrupt > > < counter overflows, no interrupt > > < counter overflows, no interrupt > > Stop program > Counter reads as 5 > > ... and thus cannot provide any reasonable confidence in results. Oh ok I see, this is not what we want at all! Currently we register that one IRQ for CPU0. So what happens today is: Program counter to 0 Start program < program scheduled on CPU0 > < counter overflows, interrupt > < program scheduled on CPU1 > < counter overflows, no interrupt > < counter overflows, no interrupt > Stop program Counter reads as 5 Which is off too... I could also reproduce this by manually moving the process between CPU0/1... We probably should not probe the driver at all then? It seems rather unwise to provide users PMU data which might be plain wrong. > > In theory, we could use a hrtimer to periodically refresh the events to > prevent this, but this would need to be set at some very high frequency > to account for the fastest potential counter, and would significantly > degrade performance. > > I don't think that's going to be reliable, and given that, I don't think > that we can support muxed IRQs in this way. Is it possible to get the overflow interrupts as well as read the PMU counters for another CPU? If not, I assume the interrupt bounce mechanism from ux500 is the only way? -- Stefan > > Thanks, > Mark. > >> >> Signed-off-by: Stefan Agner <stefan@...er.ch> >> --- >> This has been observed and tested on a i.MX 6DualLite, but is probably >> valid for i.MX 6Quad as well. >> >> It seems that ux500 once had support for single IRQ on a SMP system, >> however this got removed with: >> Commit 2b05f6ae1ee5 ("ARM: ux500: remove PMU IRQ bouncer") >> >> I noticed that with this patch I get an error when trying to use perf stat: >> # perf top >> Error: >> cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' >> >> Without this patch perf top seems to work, but it seems not to use any >> sampling events (?): >> # perf top >> PerfTop: 7215 irqs/sec kernel:100.0% exact: 0.0% [4000Hz cpu-clock:pppH], (all, 2 CPUs) >> .... >> >> Also starting perf top and explicitly selecting cpu-clock seems to work >> and show the same data as before this change. >> # perf top -e cpu-clock:pppH >> PerfTop: 7214 irqs/sec kernel:100.0% exact: 0.0% [4000Hz cpu-clock:pppH], (all, 2 CPUs) >> >> It seems that perf top falls back to cpu-clock in the old case, but not >> once sampling events are not supported... >> >> -- >> Stefan >> >> >> drivers/perf/arm_pmu_platform.c | 19 +++++++++++-------- >> 1 file changed, 11 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c >> index 933bd8410fc2..80b991417b6e 100644 >> --- a/drivers/perf/arm_pmu_platform.c >> +++ b/drivers/perf/arm_pmu_platform.c >> @@ -105,23 +105,26 @@ static int pmu_parse_irqs(struct arm_pmu *pmu) >> return num_irqs; >> } >> >> + if (num_irqs == 1) { >> + int irq = platform_get_irq(pdev, 0); >> + if (irq && irq_is_percpu_devid(irq)) >> + return pmu_parse_percpu_irq(pmu, irq); >> + } >> + >> /* >> * In this case we have no idea which CPUs are covered by the PMU. >> * To match our prior behaviour, we assume all CPUs in this case. >> + * Multiple CPUs with a single PMU irq are currently not handled. >> + * Rather than supporting only the first CPU, support all CPUs but >> + * without interrupt capability. >> */ >> - if (num_irqs == 0) { >> - pr_warn("no irqs for PMU, sampling events not supported\n"); >> + if (num_irqs == 0 || (nr_cpu_ids > 1 && num_irqs == 1)) { >> + pr_info("No per CPU irqs for PMU, sampling events not supported\n"); >> pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT; >> cpumask_setall(&pmu->supported_cpus); >> return 0; >> } >> >> - if (num_irqs == 1) { >> - int irq = platform_get_irq(pdev, 0); >> - if (irq && irq_is_percpu_devid(irq)) >> - return pmu_parse_percpu_irq(pmu, irq); >> - } >> - >> if (nr_cpu_ids != 1 && !pmu_has_irq_affinity(pdev->dev.of_node)) { >> pr_warn("no interrupt-affinity property for %pOF, guessing.\n", >> pdev->dev.of_node); >> -- >> 2.20.1 >>
Powered by blists - more mailing lists