Message-ID: <Y77nswJ7gMWekXTt@hirez.programming.kicks-ass.net>
Date:   Wed, 11 Jan 2023 17:45:39 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Namhyung Kim <namhyung@...nel.org>
Cc:     Ingo Molnar <mingo@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...nel.org>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Ravi Bangoria <ravi.bangoria@....com>, bpf@...r.kernel.org
Subject: Re: [PATCH 2/3] perf/core: Set data->sample_flags in
 perf_prepare_sample()

On Wed, Jan 11, 2023 at 01:54:54PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 10, 2023 at 12:06:00PM -0800, Namhyung Kim wrote:
> 
> > Another example, but in this case it's real, is ADDR.  We cannot update
> > the data->addr just because filtered_sample_type has PHYS_ADDR or
> > DATA_PAGE_SIZE as it'd lose the original value.
> 
> Hmm, how about something like so?
> 
> /*
>  * if (flags & s) flags |= d; // without branches
>  */
> static __always_inline unsigned long
> __cond_set(unsigned long flags, unsigned long s, unsigned long d)
> {
> 	return flags | (d * !!(flags & s));
> }
> 
> Then:
> 
> 	fst = sample_type;
> 	fst = __cond_set(fst, PERF_SAMPLE_CODE_PAGE_SIZE, PERF_SAMPLE_IP);
> 	fst = __cond_set(fst, PERF_SAMPLE_DATA_PAGE_SIZE |
> 			      PERF_SAMPLE_PHYS_ADDR,	  PERF_SAMPLE_ADDR);
> 	fst = __cond_set(fst, PERF_SAMPLE_STACK_USER,     PERF_SAMPLE_REGS_USER);
> 	fst &= ~data->sample_flags;
> 

Hmm, I think it's better to write this like:

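/*
 * Returns d if (flags & s), else 0; without branches. The caller ORs the
 * result into its own accumulator.
 */
static __always_inline unsigned long
__cond_set(unsigned long flags, unsigned long s, unsigned long d)
{
	return d * !!(flags & s);
}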

	fst = sample_type;
	fst |= __cond_set(sample_type, PERF_SAMPLE_CODE_PAGE_SIZE, PERF_SAMPLE_IP);
	fst |= __cond_set(sample_type, PERF_SAMPLE_DATA_PAGE_SIZE |
			               PERF_SAMPLE_PHYS_ADDR,	   PERF_SAMPLE_ADDR);
	fst |= __cond_set(sample_type, PERF_SAMPLE_STACK_USER,     PERF_SAMPLE_REGS_USER);
	fst &= ~data->sample_flags;

Which should be identical but has fewer data dependencies and thus gives
an OoO CPU more leeway to parallelize things.
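
The equivalence of the two forms can be checked with a small user-space
sketch. It rests on one observation: none of the bits that get set (the
`d` arguments) is also a tested bit (an `s` argument) of a later call, so
testing the original sample_type gives the same answer as testing the
accumulated value. Everything below is illustrative only: the X_* bits
are hypothetical stand-ins, not the real PERF_SAMPLE_* values, the helper
names are made up for this example, and the final
`fst &= ~data->sample_flags` step is left out because it is the same in
both forms.

```c
#include <assert.h>

/* Hypothetical stand-in bits, NOT the real PERF_SAMPLE_* values. */
#define X_IP             (1UL << 0)
#define X_ADDR           (1UL << 1)
#define X_REGS_USER      (1UL << 2)
#define X_STACK_USER     (1UL << 3)
#define X_PHYS_ADDR      (1UL << 4)
#define X_DATA_PAGE_SIZE (1UL << 5)
#define X_CODE_PAGE_SIZE (1UL << 6)
#define X_ALL            ((1UL << 7) - 1)

/* First form: returns flags with d OR-ed in when any bit of s is set. */
static inline unsigned long
cond_set_chained(unsigned long flags, unsigned long s, unsigned long d)
{
	return flags | (d * !!(flags & s));
}

/* Second form: returns only d (or 0); the caller accumulates with |=. */
static inline unsigned long
cond_set_mask(unsigned long flags, unsigned long s, unsigned long d)
{
	return d * !!(flags & s);
}

/* Each call consumes the previous call's result: a serial chain. */
static unsigned long fst_chained(unsigned long sample_type)
{
	unsigned long fst = sample_type;

	fst = cond_set_chained(fst, X_CODE_PAGE_SIZE, X_IP);
	fst = cond_set_chained(fst, X_DATA_PAGE_SIZE | X_PHYS_ADDR, X_ADDR);
	fst = cond_set_chained(fst, X_STACK_USER, X_REGS_USER);
	return fst;
}

/* Each call reads only sample_type, so the three tests are independent. */
static unsigned long fst_independent(unsigned long sample_type)
{
	unsigned long fst = sample_type;

	fst |= cond_set_mask(sample_type, X_CODE_PAGE_SIZE, X_IP);
	fst |= cond_set_mask(sample_type, X_DATA_PAGE_SIZE | X_PHYS_ADDR, X_ADDR);
	fst |= cond_set_mask(sample_type, X_STACK_USER, X_REGS_USER);
	return fst;
}

/* Returns 1 if both forms agree for every combination of stand-in bits. */
static int forms_agree(void)
{
	unsigned long st;

	for (st = 0; st <= X_ALL; st++)
		if (fst_chained(st) != fst_independent(st))
			return 0;
	return 1;
}
```

If a later call ever tested a bit that an earlier call could set, the two
forms would diverge, and the chained form would be the one that
propagates the dependency.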
