Message-ID: <a99c41fb-40f6-a0e2-af6b-22d7e6ac9b3d@arm.com>
Date: Thu, 13 Jun 2019 08:10:45 +0100
From: Raphael Gault <raphael.gault@....com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Mark Rutland <mark.rutland@....com>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@...il.com>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>
Subject: Re: [PATCH 3/7] perf: arm64: Use rseq to test userspace access to pmu counters

Hi Mathieu,

On 6/11/19 8:33 PM, Mathieu Desnoyers wrote:
> ----- On Jun 11, 2019, at 6:57 PM, Mark Rutland mark.rutland@....com wrote:
>
>> Hi Arnaldo,
>>
>> On Tue, Jun 11, 2019 at 11:33:46AM -0300, Arnaldo Carvalho de Melo wrote:
>>> Em Tue, Jun 11, 2019 at 01:53:11PM +0100, Raphael Gault escreveu:
>>>> Add an extra test to check userspace access to pmu hardware counters.
>>>> This test doesn't rely on the seqlock as a synchronisation mechanism but
>>>> instead uses the restartable sequences to make sure that the thread is
>>>> not interrupted when reading the index of the counter and the associated
>>>> pmu register.
>>>>
>>>> In addition to reading the pmu counters, this test is run several times
>>>> in order to measure the ratio of failures:
>>>> I ran this test on the Juno development platform, which is big.LITTLE
>>>> with 4 Cortex A53 and 2 Cortex A57. The results vary quite a lot
>>>> (running it with 100 tests is not so long and I did it several times).
>>>> I ran it once with 10000 iterations:
>>>> `runs: 10000, abort: 62.53%, zero: 34.93%, success: 2.54%`
>>>>
>>>> Signed-off-by: Raphael Gault <raphael.gault@....com>
>>>> ---
>>>> tools/perf/arch/arm64/include/arch-tests.h | 5 +-
>>>> tools/perf/arch/arm64/include/rseq-arm64.h | 220 ++++++++++++++++++
>>>
>>> So, I applied the first patch in this series, but could you please break
>>> this patch into at least two, one introducing the facility
>>> (include/rseq*) and the second adding the test?
>>>
>>> We try to enforce this kind of granularity as down the line we may want
>>> to revert one part while the other already has other uses and thus
>>> wouldn't allow a straight revert.
>>>
>>> Also, can this go to tools/arch/ instead? Is this really perf specific?
>>> Isn't there any arch/arm64/include files for the kernel that we could
>>> mirror and have it checked for drift in tools/perf/check-headers.sh?
>>
>> The rseq bits aren't strictly perf specific, and I think the existing
>> bits under tools/testing/selftests/rseq/ could be factored out to common
>> locations under tools/include/ and tools/arch/*/include/.
>
> Hi Mark,
>
> Thanks for CCing me!
>
> Or into a stand-alone librseq project:
>
> https://github.com/compudj/librseq (currently a development branch in
> my own github)
>
> I don't see why this user-space code should sit in the kernel tree.
> It is not tooling-specific.
>
>>
>> From a scan, those already duplicate barriers and other helpers which
>> already have definitions under tools/, which seems unfortunate. :/
>>
>> Comments below are for Raphael and Mathieu.
>>
>> [...]
>>
>>>> +static u64 noinline mmap_read_self(void *addr, int cpu)
>>>> +{
>>>> + struct perf_event_mmap_page *pc = addr;
>>>> + u32 idx = 0;
>>>> + u64 count = 0;
>>>> +
>>>> + asm volatile goto(
>>>> + RSEQ_ASM_DEFINE_TABLE(0, 1f, 2f, 3f)
>>>> + "nop\n"
>>>> + RSEQ_ASM_STORE_RSEQ_CS(1, 0b, rseq_cs)
>>>> + RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
>>>> + RSEQ_ASM_OP_R_LOAD(pc_idx)
>>>> + RSEQ_ASM_OP_R_AND(0xFF)
>>>> + RSEQ_ASM_OP_R_STORE(idx)
>>>> + RSEQ_ASM_OP_R_SUB(0x1)
>>>> + RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
>>>> + "msr pmselr_el0, " RSEQ_ASM_TMP_REG "\n"
>>>> + "isb\n"
>>>> + RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 3f)
>
> I really don't understand why the cpu_id needs to be compared 3 times
> here (?!?)
>
> Explicit comparison of the cpu_id within the rseq critical section
> should be done _once_.
>
I understand, and that is what I thought as well, but I was confused by a
comment in (src)/include/uapi/linux/rseq.h, which states:
> This CPU number value should always be compared
> against the value of the cpu_id field before performing a rseq
> commit or returning a value read from a data structure indexed
> using the cpu_id_start value.
I'll remove the redundant cpu_id comparisons and keep a single one.
> If the kernel happens to preempt and migrate the thread while in the
> critical section, it's the kernel's job to move user-space execution
> to the abort handler.
>
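To make the single-comparison point concrete, here is a minimal plain-C
model of the control flow. It is only a sketch and not real rseq code:
struct model_rseq, MODEL_NR_CPUS, model_read_once() and model_read_self()
are hypothetical names invented for illustration, the per-CPU table stands
in for the PMU register read, and the kernel-driven abort on preemption or
migration is not modelled (a failed comparison simply returns false so the
caller can retry).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size of the per-CPU table used by this model. */
#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for the cpu_id_start/cpu_id fields of struct rseq. */
struct model_rseq {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
};

/*
 * Model of one pass through the critical section: the cpu number is
 * compared against cpu_id exactly once, before the value is returned.
 * Returns false when the caller should retry (the "abort" path).
 */
bool model_read_once(const struct model_rseq *rs,
		     const uint64_t counters[MODEL_NR_CPUS],
		     uint32_t cpu, uint64_t *out)
{
	assert(rs != NULL && counters != NULL && out != NULL);

	if (cpu >= MODEL_NR_CPUS)
		return false;
	if (rs->cpu_id != cpu)	/* the single explicit comparison */
		return false;
	*out = counters[cpu];
	return true;
}

/* Snapshot cpu_id_start, then retry the section a bounded number of times. */
bool model_read_self(const struct model_rseq *rs,
		     const uint64_t counters[MODEL_NR_CPUS],
		     unsigned int max_tries, uint64_t *out)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		uint32_t cpu = rs->cpu_id_start;

		if (model_read_once(rs, counters, cpu, out))
			return true;
	}
	return false;
}
```

A caller would pass its own table and a retry bound, for example
model_read_self(&rs, counters, 3, &value), and treat a false return as
"aborted every time". In the real test the retry is driven by the abort
label of the asm goto block rather than by a loop in C.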
[...]
Thanks,
--
Raphael Gault