Message-ID: <aP1yM53r1GLS-767@google.com>
Date: Sat, 25 Oct 2025 17:58:27 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>,
	Adrian Hunter <adrian.hunter@...el.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
	Steven Rostedt <rostedt@...nel.org>, linux-kernel@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org, bpf@...r.kernel.org,
	x86@...nel.org, Masami Hiramatsu <mhiramat@...nel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Ingo Molnar <mingo@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrii Nakryiko <andrii@...nel.org>,
	Indu Bhagat <indu.bhagat@...cle.com>,
	"Jose E. Marchesi" <jemarch@....org>,
	Beau Belgrave <beaub@...ux.microsoft.com>,
	Jens Remus <jremus@...ux.ibm.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Florian Weimer <fweimer@...hat.com>, Sam James <sam@...too.org>,
	Kees Cook <kees@...nel.org>, Carlos O'Donell <codonell@...hat.com>
Subject: Re: [PATCH v16 0/4] perf: Support the deferred unwinding
 infrastructure

Hi Peter,

On Fri, Oct 24, 2025 at 02:58:41PM +0200, Peter Zijlstra wrote:
> 
> Arnaldo, Namhyung,
> 
> On Fri, Oct 24, 2025 at 10:26:56AM +0200, Peter Zijlstra wrote:
> 
> > > So "perf_iterate_sb()" was the key point I was missing. I'm guessing it's
> > > basically a demultiplexer that distributes events to all the requestors?
> > 
> > A superset. Basically every event in the relevant context that 'wants'
> > it.
> > 
> > It is what we use for all traditional side-band events (hence the _sb
> > naming) like mmap, task creation/exit, etc.
> > 
> > I was under the impression the perf tool would create one software dummy
> > event to listen specifically for these events per buffer, but alas, when
> > I looked at the tool this does not appear to be the case.
> > 
> > As a result it is possible to receive these events multiple times. And
> > since that is a problem that needs to be solved anyway, I didn't think
> > it 'relevant' in this case.
> 
> When I use:
> 
>   perf record -ag -e cycles -e instructions
> 
> I get:
> 
> # event : name = cycles, , id = { }, type = 0 (PERF_TYPE_HARDWARE), size = 136, config = 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq } = 2000, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1, defer_callchain = 1
> # event : name = instructions, , id = { }, type = 0 (PERF_TYPE_HARDWARE), size = 136, config = 0x1 (PERF_COUNT_HW_INSTRUCTIONS), { sample_period, sample_freq } = 2000, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1, defer_callchain = 1
> # event : name = dummy:u, , id = { }, type = 1 (PERF_TYPE_SOFTWARE), size = 136, config = 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST, exclude_kernel = 1, exclude_hv = 1, mmap = 1, comm = 1, task = 1, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1, build_id = 1, defer_output = 1
> 
> And we have this dummy event I spoke of above; and it has defer_output
> set, none of the others do. This is what I expected.
> 
> *However*, when I use:
> 
>   perf record -g -e cycles -e instructions
> 
> I get:
> 
> # event : name = cycles, , id = { }, type = 0 (PERF_TYPE_HARDWARE), size = 136, config = 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq } = 2000, sample_type = IP|TID|TIME|CALLCHAIN|ID|PERIOD, read_format = ID|LOST, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, sample_id_all = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1, build_id = 1, defer_callchain = 1, defer_output = 1
> # event : name = instructions, , id = { }, type = 0 (PERF_TYPE_HARDWARE), size = 136, config = 0x1 (PERF_COUNT_HW_INSTRUCTIONS), { sample_period, sample_freq } = 2000, sample_type = IP|TID|TIME|CALLCHAIN|ID|PERIOD, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, sample_id_all = 1, defer_callchain = 1
> 
> Which doesn't have a dummy event. Notably the first real event has
> defer_output set (and all the other sideband stuff like mmap, comm,
> etc.).
> 
> Is there a reason the !cpu mode doesn't have the dummy event? Anyway, it
> should all work, just unexpected inconsistency that confused me. 

Right, I don't remember why.  I don't think there's any reason to do it
for system-wide mode only.

Adrian, do you have any idea?  I have a vague memory that you worked on
this in the past.

Thanks,
Namhyung
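
For anyone who wants to check which event ends up carrying the side-band
bits in a given recording, here is a rough Python sketch.  It only parses
"# event :" header lines of the form quoted above.  The helper names
(parse_event, sideband_carriers), the SIDEBAND_BITS set and the shortened
sample lines are made up for illustration and are not part of perf or of
this patch series.

```python
import re

# Attribute bits treated as "side-band" here; the list is an assumption
# based on the bits visible in the quoted attribute dumps.
SIDEBAND_BITS = {"mmap", "mmap2", "comm", "comm_exec", "task",
                 "ksymbol", "bpf_event", "defer_output"}


def parse_event(line):
    """Return (event name, set of attribute bits printed as '= 1')."""
    name = re.search(r"name = ([^,]+)", line).group(1).strip()
    # Match 'word = 1' only when followed by a comma or end of line, so
    # 'size = 136' and 'type = 1 (PERF_TYPE_SOFTWARE)' are not counted.
    bits = set(re.findall(r"(\w+) = 1(?=\s*(?:,|$))", line))
    return name, bits


def sideband_carriers(header_lines):
    """Names of events that have at least one side-band bit set."""
    carriers = []
    for line in header_lines:
        if not line.startswith("# event :"):
            continue
        name, bits = parse_event(line)
        if bits & SIDEBAND_BITS:
            carriers.append(name)
    return carriers


# Shortened versions of the dumps quoted above (most fields dropped).
SYSTEM_WIDE = [
    "# event : name = cycles, , type = 0 (PERF_TYPE_HARDWARE), disabled = 1, freq = 1, defer_callchain = 1",
    "# event : name = instructions, , type = 0 (PERF_TYPE_HARDWARE), disabled = 1, freq = 1, defer_callchain = 1",
    "# event : name = dummy:u, , type = 1 (PERF_TYPE_SOFTWARE), mmap = 1, comm = 1, task = 1, defer_output = 1",
]
PER_TASK = [
    "# event : name = cycles, , type = 0 (PERF_TYPE_HARDWARE), inherit = 1, mmap = 1, comm = 1, task = 1, defer_callchain = 1, defer_output = 1",
    "# event : name = instructions, , type = 0 (PERF_TYPE_HARDWARE), inherit = 1, freq = 1, defer_callchain = 1",
]

if __name__ == "__main__":
    print("system-wide:", sideband_carriers(SYSTEM_WIDE))
    print("per-task:   ", sideband_carriers(PER_TASK))
```

With the shortened sample lines, the system-wide case reports only the
dummy event and the per-task case reports only the first real event,
which matches the inconsistency Peter describes.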

