linux-kernel - Re: [PATCH] perf: arm_spe: Use Inner Shareable DSB when draining the buffer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <6878fd559ca5c63a251c47c271df7c5e@kernel.org>
Date:   Mon, 19 Oct 2020 13:55:42 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Mark Rutland <mark.rutland@....com>
Cc:     Alexandru Elisei <alexandru.elisei@....com>,
        linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
        linux-kernel@...r.kernel.org, james.morse@....com,
        julien.thierry.kdev@...il.com, suzuki.poulose@....com,
        catalin.marinas@....com, will@...nel.org
Subject: Re: [PATCH] perf: arm_spe: Use Inner Shareable DSB when draining the
 buffer

On 2020-10-19 13:24, Mark Rutland wrote:
> On Tue, Oct 06, 2020 at 05:13:31PM +0100, Alexandru Elisei wrote:
>> Hi Marc,
>> 
>> Thank you for having a look at the patch!
>> 
>> On 10/6/20 4:32 PM, Marc Zyngier wrote:
>> > Hi Alex,
>> >
>> > On Tue, 06 Oct 2020 16:05:20 +0100,
>> > Alexandru Elisei <alexandru.elisei@....com> wrote:
>> >> From ARM DDI 0487F.b, page D9-2807:
>> >>
>> >> "Although the Statistical Profiling Extension acts as another observer in
>> >> the system, for determining the Shareability domain of the DSB
>> >> instructions, the writes of sample records are treated as coming from the
>> >> PE that is being profiled."
>> >>
>> >> Similarly, on page D9-2801:
>> >>
>> >> "The memory type and attributes that are used for a write by the
>> >> Statistical Profiling Extension to the Profiling Buffer is taken from the
>> >> translation table entries for the virtual address being written to. That
>> >> is:
>> >> - The writes are treated as coming from an observer that is coherent with
>> >>   all observers in the Shareability domain that is defined by the
>> >>   translation tables."
>> >>
>> >> All the PEs are in the Inner Shareable domain, use a DSB ISH to make sure
>> >> writes to the profiling buffer have completed.
>> > I'm a bit sceptical of this change. The SPE writes are per-CPU, and
>> > all we are trying to ensure is that the CPU we are running on has
>> > drained its own queue of accesses.
>> >
>> > The accesses being made within the IS domain doesn't invalidate the
>> > fact that they are still per-CPU, because "the writes of sample
>> > records are treated as coming from the PE that is being profiled.".
>> >
>> > So why should we have an IS-wide synchronisation for accesses that are
>> > purely local?
>> 
>> I think I might have misunderstood how perf spe works. Below is my 
>> original train
>> of thought.
>> 
>> In the buffer management event interrupt we drain the buffer, and if 
>> the buffer is
>> full, we call arm_spe_perf_aux_output_end() -> perf_aux_output_end(). 
>> The comment
>> for perf_aux_output_end() says "Commit the data written by hardware 
>> into the ring
>> buffer by adjusting aux_head and posting a PERF_RECORD_AUX into the 
>> perf buffer.
>> It is the pmu driver's responsibility to observe ordering rules of the 
>> hardware,
>> so that all the data is externally visible before this is called." My 
>> conclusion
>> was that after we drain the buffer, the data must be visible to all 
>> CPUs.
> 
> FWIW, this reasoning sounds correct to me. The DSB NSH will be
> sufficient to drain the buffer, but we need the DSB ISH to ensure that
> it's visbile to other CPUs at the instant we call 
> perf_aux_output_end().

Right. I think I missed that last bit (and Alex's email at the same 
time).

> Otherwise, if CPU x is reading the ring-buffer written by CPU y, it
> might see the aux buffer pointers updated before the samples are
> viisble, and hence read junk from the buffer.
> 
> We can add a comment to that effect (or rework perf_aux_output_end()
> somehow to handle that ordering).

I'd rather this is done in perf_aux_output_end(), as a full blown DSB 
ISH
on guest entry is pretty harsh... It would also nicely split the 
responsibilities:

- KVM stops SPE and make sure the output is drained
- Perf makes the data visible to all CPUs

Thoughts?

         M.
-- 
Jazz is not dead. It just smells funny...