linux-kernel - Re: [PATCH] perf: RISC-V: Check standard event availability

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240424-7b4434122a8c1cf1c1173403@orel>
Date: Wed, 24 Apr 2024 15:31:46 +0200
From: Andrew Jones <ajones@...tanamicro.com>
To: Atish Kumar Patra <atishp@...osinc.com>
Cc: Ian Rogers <irogers@...gle.com>, Anup Patel <anup@...infault.org>, 
	Samuel Holland <samuel.holland@...ive.com>, Atish Patra <atishp@...shpatra.org>, 
	Albert Ou <aou@...s.berkeley.edu>, Mark Rutland <mark.rutland@....com>, 
	Palmer Dabbelt <palmer@...belt.com>, Paul Walmsley <paul.walmsley@...ive.com>, 
	Will Deacon <will@...nel.org>, linux-arm-kernel@...ts.infradead.org, 
	linux-kernel@...r.kernel.org, linux-riscv@...ts.infradead.org
Subject: Re: [PATCH] perf: RISC-V: Check standard event availability

On Tue, Apr 23, 2024 at 05:36:43PM -0700, Atish Kumar Patra wrote:
> On Mon, Apr 22, 2024 at 8:44 PM Ian Rogers <irogers@...gle.com> wrote:
> >
> > On Mon, Apr 15, 2024 at 9:07 PM Anup Patel <anup@...infault.org> wrote:
> > >
> > > On Tue, Apr 16, 2024 at 5:31 AM Atish Patra <atishp@...osinc.com> wrote:
> > > >
> > > > On 4/10/24 18:40, Samuel Holland wrote:
> > > > > Hi Atish,
> > > > >
> > > > > On 2024-03-18 2:44 PM, Atish Patra wrote:
> > > > >> On Wed, Jan 3, 2024 at 8:54 AM Samuel Holland <samuel.holland@...ive.com> wrote:
> > > > >>>
> > > > >>> The RISC-V SBI PMU specification defines several standard hardware and
> > > > >>> cache events. Currently, all of these events appear in the `perf list`
> > > > >>> output, even if they are not actually implemented. Add logic to check
> > > > >>> which events are supported by the hardware (i.e. can be mapped to some
> > > > >>> counter), so only usable events are reported to userspace.
> > > > >>>
> > > > >>
> > > > >> Thanks for the patch.
> > > > >> This adds tons of SBI calls at every boot for a use case which is at
> > > > >> best confusing for a subset of users who actually wants to run perf.
> > > > >
> > > > > I should have been clearer in the patch description. This is not just a cosmetic
> > > > > change; because perf sees these as valid events, it tries to use them in
> > > > > commands like `perf stat`. When the error from SBI_EXT_PMU_COUNTER_CFG_MATCH
> > > > > causes the ->add() callback to fail, this prevents any other events from being
> > > > > scheduled on that same CPU (search can_add_hw in kernel/events/core.c). That is
> > > > > why the dTLB/iTLB miss counts are missing in the "before" example below.
> > > > >
> > > >
> > > > Thanks for explaining the problem. I can reproduce it in qemu as well if
> > > > enough number of invalid events given on the command line and the
> > > > workload is short enough.
> > > >
> > > > >> This probing can be done at runtime by invoking the
> > > > >> pmu_sbi_check_event from pmu_sbi_event_map.
> > > > >> We can update the event map after that so that it doesn't need to
> > > > >> invoke pmu_sbi_check_event next time.
> > > > >
> > > > > I tried to implement this suggestion, but it did not work. The SBI interface
> > > > > does not distinguish between "none of the counters can monitor the specified
> > > > > event [because the event is unsupported]" and "none of the counters can monitor
> > > > > the specified event [because the counters are busy]". It is not sufficient for
> > > > > the kernel to verify that at least one counter is available before performing
> > > > > the check, because certain events may only be usable on a subset of counters
> > > > > (per riscv,event-to-mhpmcounters), and the kernel does not know that mapping.
> > > > >
> > > >
> > > > Yeah. My suggestion was to fix the perf list issue which is different
> > > > than the issue reported now.
> > > >
> > > > > As a result, checking for event support is only reliable when none of the
> > > > > counters are in use. So the check can be asynchronous/deferred to later in the
> > > > > boot process, but I believe it still needs to be performed for all events before
> > > > > userspace starts using the counters.
> > > > >
> > > >
> > > > We should defer it a work queue for sure. We can also look at improving
> > > > SBI PMU extension to support bulk matching behavior as well.
> > > >
> > > > However, I think a better solution would be to just rely on the json
> > > > file mappings instead of making SBI calls. We are going to have the
> > > > event encoding and mappings in the json in the future.
> > >
> > > The problem with JSON based event encoding is how to deal in-case
> > > we are running inside Guest/VM because Host could be anything.
> > >
> > > IMO, the JSON based approach is not suitable for SBI PMU. For now,
> > > we either defer the work using the work queue or keep the approach
> > > of this patch as-is.
> > >
> > > The good part about SBI PMU extension is that we do have a standard
> > > set of events and we only need a way to discover supported standard
> > > events with a minimum number of SBI calls. It is better to add a new
> > > SBI PMU call to assist supported event discovery which will also
> > > help us virtualize it.
> > >
> > > Regards,
> > > Anup
> >
> > +Ian Rogers
> >
> > `perf list` will already filter some events depending on whether the
> > PMU supports them, for example, legacy cache events. I think we can
> > extend this to json events.
> >
> 
> Yes. That's what I was thinking as well. However, that may be a
> problem in virtualization
> as Anup pointed out.
> 
> As per my understanding, cloud providers provide json files for VMs
> based on the host
> architecture and allow migration only between hosts with the same
> family of cpu. In RISC-V, the mapfile.csv works based on 3 registers
> indicating marchid, mimpid, and mvendorid. Thus, the json file has to
> be tied with the host machine it is going to be run.

This is also my understanding. That is, that cloud instances typically
dedicate CPUs to VMs and don't try to present CPU models to VMs which
don't exactly match the host's CPUs. The remaining concern would be if
the hypervisor doesn't emulate/passthrough everything the json describes
for the host CPU type.

However, this is just "typical" clouds. All bets are off for general
virtualization, as Anup points out.

> 
> We will end up doing the same if we only rely on the json file to
> filter events in the future. Please let me know if the assumption is
> incorrect.
> 
> If we allow a SBI call route to discover which events are supported,
> the guest can always support legacy events on any host even though it
> doesn't have a json file.

Yes, I think we need a solution which works even without a json file,
since a VM may use a special mvendorid,marchid,mimpid triplet to
describe a more generic CPU type. Unless we also create json files
for these VCPUs, or provide other event discovery mechanisms, then
the VMs will not have anything.

Thanks,
drew