lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 28 Oct 2021 11:30:29 -0700
From:   Stephane Eranian <eranian@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, acme@...hat.com, jolsa@...hat.com,
        kim.phillips@....com, namhyung@...nel.org, irogers@...gle.com
Subject: Re: [PATCH v1 00/13] perf/x86/amd: Add AMD Fam19h Branch Sampling support

On Wed, Sep 15, 2021 at 2:04 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Sep 14, 2021 at 10:55:12PM -0700, Stephane Eranian wrote:
> > On Thu, Sep 9, 2021 at 1:55 AM Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Thu, Sep 09, 2021 at 12:56:47AM -0700, Stephane Eranian wrote:
> > > > This patch series adds support for the AMD Fam19h 16-deep branch sampling
> > > > feature as described in the AMD PPR Fam19h Model 01h Revision B1 section 2.1.13.
> > >
> > > Yay..
> > >
> > > > BRS interacts with the NMI interrupt as well. Because enabling BRS is expensive,
> > > > it is only activated after P event occurrences, where P is the desired sampling period.
> > > > At P occurrences of the event, the counter overflows, the CPU catches the NMI interrupt,
> > > > activates BRS for 16 branches until it saturates, and then delivers the NMI to the kernel.
> > >
> > > WTF... ?!? Srsly? You're joking right?
> > >
> >
> > As I said, this is because of the cost of running BRS usually for
> > millions of branches to keep only the last 16.
> > Running branch sampling in general on any arch is  never totally free.
>
> Holding up the NMI will disrupt the sampling of the other events, which
> is, IMO unacceptible and would require this event to be exclusive on the
> whole PMU, simply because sharing it doesn't work.
>
Sorry for the long delay, I have been very busy.

You are right on this. It would hold the NMI for 16 taken branches.
Making the event exclusive creates a problem with the NMI watchdog.
We can try to hack something in to allow NMI watchdog + the sampling
event and nothing else.

> (also, other NMI sources might object)
>
On AMD, there is also IBS op, IBS Fetch both firing on NMI. but that
is less of a concern because the instruction address is captured by IBS
and the interrupted IP is not useful. So the interrupt skid is not important.

> Also, by only having LBRs post overflow you can't apply LBR based
> analysis to other events, which seems quite limiting.
>
This is a very limited functionality designed to support basic sampling
primarily to support autoFDO where there is only one sampling event.

> This really seems like a very sub-optimal solution. I mean, it's awesome
> AMD gets branch records, but this seems a very poor solution.

For now, this is what we have. It is important to get some basic form of branch
sampling on Zen3 even if it is not perfect because it enables optimizations such
as autoFDO for compilers today. We have verified that autoFDO works well with
branch sampling on Zen3.

I hope it will improve in the future.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ