Message-ID: <CAM9d7cjARKJ7Xj93k00KccJ+UQtytZn-g8nbWpz9nfT9s2nkhQ@mail.gmail.com>
Date: Wed, 11 Jan 2023 09:59:13 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...nel.org>,
Kan Liang <kan.liang@...ux.intel.com>,
Ravi Bangoria <ravi.bangoria@....com>, bpf@...r.kernel.org
Subject: Re: [PATCH 2/3] perf/core: Set data->sample_flags in perf_prepare_sample()
On Wed, Jan 11, 2023 at 8:45 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Wed, Jan 11, 2023 at 01:54:54PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 10, 2023 at 12:06:00PM -0800, Namhyung Kim wrote:
> >
> > > Another example, but in this case it's real, is ADDR. We cannot update
> > > the data->addr just because filtered_sample_type has PHYS_ADDR or
> > > DATA_PAGE_SIZE as it'd lose the original value.
> >
> > Hmm, how about something like so?
> >
> > /*
> >  * if (flags & s) flags |= d; // without branches
> >  */
> > static __always_inline unsigned long
> > __cond_set(unsigned long flags, unsigned long s, unsigned long d)
> > {
> > 	return flags | (d * !!(flags & s));
> > }
> >
> > Then:
> >
> > fst = sample_type;
> > fst = __cond_set(fst, PERF_SAMPLE_CODE_PAGE_SIZE, PERF_SAMPLE_IP);
> > fst = __cond_set(fst, PERF_SAMPLE_DATA_PAGE_SIZE |
> >                       PERF_SAMPLE_PHYS_ADDR, PERF_SAMPLE_ADDR);
> > fst = __cond_set(fst, PERF_SAMPLE_STACK_USER, PERF_SAMPLE_REGS_USER);
> > fst &= ~data->sample_flags;
> >
>
> Hmm, I think it's better to write this like:
>
> static __always_inline unsigned long
> __cond_set(unsigned long flags, unsigned long s, unsigned long d)
> {
> 	return d * !!(flags & s);
> }
>
> fst = sample_type;
> fst |= __cond_set(sample_type, PERF_SAMPLE_CODE_PAGE_SIZE, PERF_SAMPLE_IP);
> fst |= __cond_set(sample_type, PERF_SAMPLE_DATA_PAGE_SIZE |
>                                PERF_SAMPLE_PHYS_ADDR, PERF_SAMPLE_ADDR);
> fst |= __cond_set(sample_type, PERF_SAMPLE_STACK_USER, PERF_SAMPLE_REGS_USER);
> fst &= ~data->sample_flags;
>
> Which should be identical but has fewer data dependencies and thus gives
> an OoO CPU more leeway to parallelize things.
Looks good. Let me try this.
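
For reference, a minimal userspace sketch of the two quoted variants, meant only to illustrate why they should produce the same mask. The DEMO_* bits are hypothetical stand-ins, not the kernel's PERF_SAMPLE_* values, and cond_set(), fst_independent(), fst_chained() and check_variants_agree() are names made up for this sketch. It assumes, as in the quoted code, that no dependent bit (IP, ADDR, REGS_USER) is also tested as a source bit.

```c
#include <assert.h>

/* Hypothetical stand-in bits for this demo; NOT the kernel's PERF_SAMPLE_* values. */
#define DEMO_IP             (1UL << 0)
#define DEMO_ADDR           (1UL << 1)
#define DEMO_REGS_USER      (1UL << 2)
#define DEMO_STACK_USER     (1UL << 3)
#define DEMO_PHYS_ADDR      (1UL << 4)
#define DEMO_DATA_PAGE_SIZE (1UL << 5)
#define DEMO_CODE_PAGE_SIZE (1UL << 6)
#define DEMO_ALL            ((1UL << 7) - 1)

/* Returns d if any bit of s is set in flags, otherwise 0; no branch needed. */
static inline unsigned long
cond_set(unsigned long flags, unsigned long s, unsigned long d)
{
	return d * !!(flags & s);
}

/* Second variant: every call reads the unmodified sample_type. */
static unsigned long
fst_independent(unsigned long sample_type, unsigned long sample_flags)
{
	unsigned long fst = sample_type;

	fst |= cond_set(sample_type, DEMO_CODE_PAGE_SIZE, DEMO_IP);
	fst |= cond_set(sample_type, DEMO_DATA_PAGE_SIZE | DEMO_PHYS_ADDR, DEMO_ADDR);
	fst |= cond_set(sample_type, DEMO_STACK_USER, DEMO_REGS_USER);
	return fst & ~sample_flags;	/* drop what is already filled in */
}

/* First variant: each step consumes the result of the previous one. */
static unsigned long
fst_chained(unsigned long sample_type, unsigned long sample_flags)
{
	unsigned long fst = sample_type;

	fst |= cond_set(fst, DEMO_CODE_PAGE_SIZE, DEMO_IP);
	fst |= cond_set(fst, DEMO_DATA_PAGE_SIZE | DEMO_PHYS_ADDR, DEMO_ADDR);
	fst |= cond_set(fst, DEMO_STACK_USER, DEMO_REGS_USER);
	return fst & ~sample_flags;
}

/* Exhaustively compare both variants over all combinations of the demo bits. */
static void check_variants_agree(void)
{
	for (unsigned long t = 0; t <= DEMO_ALL; t++)
		for (unsigned long f = 0; f <= DEMO_ALL; f++)
			assert(fst_independent(t, f) == fst_chained(t, f));
}
```

With these stand-in bits, requesting only DEMO_PHYS_ADDR pulls in DEMO_ADDR as well, unless DEMO_ADDR is already marked in sample_flags, in which case it is masked out again.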
Thanks,
Namhyung