Message-ID: <024b7bb4-731e-4da4-8480-4789f5912977@linux.ibm.com>
Date: Tue, 16 Dec 2025 10:29:28 +0100
From: Jens Remus <jremus@...ux.ibm.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
        Ian Rogers <irogers@...gle.com>, James Clark <james.clark@...aro.org>,
        Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
        Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, linux-perf-users@...r.kernel.org,
        Steven Rostedt <rostedt@...dmis.org>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Indu Bhagat <indu.bhagat@...cle.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        linux-trace-kernel@...r.kernel.org, bpf@...r.kernel.org,
        Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>
Subject: Re: [PATCH v6 4/6] perf script: Display
 PERF_RECORD_CALLCHAIN_DEFERRED

Hello Namhyung!

On 12/16/2025 5:48 AM, Namhyung Kim wrote:
> On Fri, Dec 12, 2025 at 01:11:38PM +0100, Jens Remus wrote:

>> following is an observation from my attempt to enable unwind user fp on
>> s390 using s390 back chain instead of frame pointer and relaxing the
>> s390-specific IP validation check.
>>
>> When capturing call graphs of a Java application the list of "unwound"
>> user space IPs may contain invalid entries, such as 0x0, 0xdeaddeaf,
>> and 0xffffffffffffffff.  IPs that exceed PERF_CONTEXT_MAX, such as the
>> latter, cause perf not to display any deferred (or merged) call chain.
>> Note that this is not caused by your patch series.
> 
> Right, it's not a real IP so perf ABI treats them as a magic context.
> 
>>
>> While re-adding the s390-specific IP checks would "hide" those, I found
>> that the call graphs look good otherwise.  That is the back chain seems
>> to be intact.  It is just the user space application (e.g. Java JRE) not
>> correctly adhering to the ABI and saving the return address to the
>> specified location on the stack, causing bogus IPs to be reported.
>>
>> Could perf be improved to handle those user space IPs that exceed
>> PERF_CONTEXT_MAX?
> 
> Ideally we should not have them in the first place.  Is it a JRE issue
> or your s390 unwinder issue?  Is it possible to ignore them in the
> unwinder?

Stack tracing using a frame pointer is virtually impossible on s390, as
the ABI does not designate a specific register as the FP register, does
not specify a fixed register save area layout, and does not mandate that
a FP be set up early.  Compilers usually set up a FP late, that is after
static stack allocation.

An alternative is the s390-specific back chain, which is basically a
frame pointer on the stack.  The ABI specifies that *(SP+0) has the pointer
to the previous frame and *(BC-48) has the return address (RA), if a
back chain is used (e.g. compiler option -mbackchain is used).  This is
why I implemented unwind user fp on s390 using back chain.  Note that
the back chain can be correctly followed, even if the saved RAs are
bogus.  That is what can be observed in case of this specific Java JRE.
Apparently it correctly maintains the back chain stack slot, but does
not correctly maintain the RA stack slot.  So the RA stack save slot may
contain any random value.
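
To illustrate why the chain itself survives bogus RAs, here is a minimal
user space sketch of such a walk over a simulated stack.  It is not the
actual s390 unwind user code: struct sim_stack, sim_read() and
walk_back_chain() are made-up names, and the RA offset is taken literally
from the description above without being checked against the ABI document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: memory is an array of 64-bit slots, addresses are byte offsets. */
struct sim_stack {
	const uint64_t *slots;
	size_t nr_slots;
};

/* *(BC-48) holds the RA, as described in the text above (assumption). */
#define RA_OFFSET_FROM_BC 48

/* Read one aligned 64-bit slot; returns 0 on success, -1 if out of range. */
static int sim_read(const struct sim_stack *st, uint64_t addr, uint64_t *val)
{
	if (addr % 8 || addr / 8 >= st->nr_slots)
		return -1;
	*val = st->slots[addr / 8];
	return 0;
}

/* Follow the back chain starting at sp and store every RA found in out[]. */
static size_t walk_back_chain(const struct sim_stack *st, uint64_t sp,
			      uint64_t *out, size_t max)
{
	size_t n = 0;

	assert(st && out);
	while (n < max) {
		uint64_t bc, ra;

		if (sim_read(st, sp, &bc) || bc == 0)
			break;		/* unreadable slot or outermost frame */
		if (bc <= sp)
			break;		/* chain must point towards older frames */
		if (sim_read(st, bc - RA_OFFSET_FROM_BC, &ra))
			break;
		out[n++] = ra;		/* recorded as is, even if it is bogus */
		sp = bc;
	}
	return n;
}
```

The walk only ever dereferences the back chain slot to find the next
frame, so a garbage value in an RA slot ends up in the output but does not
derail the traversal.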

The s390-implementation of unwind user fp could check whether the return
address is a valid IP.  This is how it is implemented in the existing
stack tracer in arch/s390/kernel/stacktrace.c:

static inline bool ip_invalid(unsigned long ip)
{
	/* ABI requires IPs to be 2-byte aligned */
	if (ip & 1)
		return true;
	if (ip < mmap_min_addr)
		return true;
	if (ip >= current->mm->context.asce_limit)
		return true;
	return false;
}

It could then either stop or return some magic value
(e.g. PERF_CONTEXT_MAX - 1) to indicate that the IP is invalid and
continue.  Actually I would prefer to continue so that a user can see
that there is something odd with the stack trace.
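
A rough user space sketch of the "substitute and continue" variant could
look as follows.  UNWIND_IP_INVALID, struct addr_limits and sanitize_ip()
are hypothetical names for illustration only.  The limits stand in for
mmap_min_addr and asce_limit, which are not available outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value as defined in the perf uapi header. */
#define PERF_CONTEXT_MAX	((uint64_t)-4095)
/* Hypothetical marker for an invalid IP; not an existing ABI constant. */
#define UNWIND_IP_INVALID	(PERF_CONTEXT_MAX - 1)

/* Stand-ins for mmap_min_addr and current->mm->context.asce_limit. */
struct addr_limits {
	uint64_t min_addr;
	uint64_t limit;
};

/* Same three checks as ip_invalid() above, with explicit limits. */
static bool ip_invalid(uint64_t ip, const struct addr_limits *lim)
{
	if (ip & 1)			/* IPs must be 2-byte aligned */
		return true;
	if (ip < lim->min_addr)
		return true;
	if (ip >= lim->limit)
		return true;
	return false;
}

/* Keep valid IPs, replace invalid ones by the marker and carry on. */
static uint64_t sanitize_ip(uint64_t ip, const struct addr_limits *lim)
{
	return ip_invalid(ip, lim) ? UNWIND_IP_INVALID : ip;
}
```

As the marker is below PERF_CONTEXT_MAX, perf would treat it as an
ordinary (unresolvable) IP instead of an unknown context, so the rest of
the call chain would remain visible.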

Alternatively such a check could possibly also be implemented in the
common unwind user, if the address space limits are known in common
code, or as an architecture-specific hook.  In general I tend to at
least add a check whether the IP is zero, as this is used on several
architectures as indication for outermost frames (usually in
combination with a FP of zero).
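
As a sketch of the hook idea: common code could do the zero check itself
and delegate anything stricter to an optional architecture callback.  The
names arch_ip_valid_fn, unwind_ip_plausible() and s390_like_ip_valid() are
made up for this example and do not exist in unwind user today.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook type: an architecture may supply stricter checks. */
typedef bool (*arch_ip_valid_fn)(uint64_t ip);

/* Common check: reject a zero IP, then ask the architecture, if it cares. */
static bool unwind_ip_plausible(uint64_t ip, arch_ip_valid_fn arch_check)
{
	if (ip == 0)
		return false;
	return arch_check ? arch_check(ip) : true;
}

/* Example hook modelled on the s390 alignment rule only (simplified). */
static bool s390_like_ip_valid(uint64_t ip)
{
	return !(ip & 1);
}
```

Architectures without additional constraints would simply pass NULL and
get the zero check only.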

>>
>> Is there otherwise guidance how unwind user and/or the s390
>> implementation should deal with such IPs?  Should it stop taking the
>> deferred calltrace?  Should it substitute those with e.g 0, so that
>> perf can display them?


>> Sample for IP == ffffffffffffffff (perf does not display any call chain):
...
>> # perf report -D
>> ...
>> 44004346257 0x17718 [0x40]: PERF_RECORD_SAMPLE(IP, 0x2): 1082/1084: 0x3ffa3e413aa period: 1001001 addr: 0
>> ... FP chain: nr:2
>> .....  0: fffffffffffffd80
>> .....  1: 0000000400000079
>> ...... (deferred)
>>  ... thread: java:1084
>>  ...... dso: /tmp/perf-1082.map
>>
>> 0x17758@...f.data [0xd0]: event: 22
>> .
>> . ... raw event: size 208 bytes
...
>>
>> 44004348864 0x17758 [0xd0]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2): 1082/1084: 0x400000079
>> ... FP chain: nr:21
>> .....  0: 000003ffa3e413aa
>> .....  1: 000003ff3809e2d0
>> .....  2: 000003ff3809e130
>> .....  3: 000003ffb95fdf68
>> .....  4: 0000000000000000
>> .....  5: 000003ffb95fe128
>> .....  6: 000003ffb95fe1d0
>> .....  7: 005780888e7647a5
>> .....  8: 000003ffa3e437f2
>> .....  9: ffffffffffffffff <-- !
>> ..... 10: 000003ffa3e4a1fc
>> ..... 11: 0000000000000000
>> ..... 12: 000003ffa3e37900
>> ..... 13: 000003ffa3e41080
>> ..... 14: 000003ffb9dd11de
>> ..... 15: 000003ffb9e8df92
>> ..... 16: 000003ffb9e90e86
>> ..... 17: 000003ffbab8b07e
>> ..... 18: 000003ffbab8e040
>> ..... 19: 000003ffba8abbd8
>> ..... 20: 000003ffba92b950
>> : unhandled!
>>
>> ...
>> [next entry]
>>
>>
>> On 11/21/2025 12:48 AM, Namhyung Kim wrote:

>>> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
>>
>>> +static int process_deferred_sample_event(const struct perf_tool *tool,
>>> +					 union perf_event *event,
>>> +					 struct perf_sample *sample,
>>> +					 struct evsel *evsel,
>>> +					 struct machine *machine)
>>> +{
>>
>>> +	perf_sample__fprintf_start(scr, sample, al.thread, evsel,
>>> +				   PERF_RECORD_CALLCHAIN_DEFERRED, fp);
>>> +	fprintf(fp, "DEFERRED CALLCHAIN [cookie: %llx]",
>>> +		(unsigned long long)event->callchain_deferred.cookie);
>>> +
>>> +	if (PRINT_FIELD(IP)) {
>>> +		struct callchain_cursor *cursor = NULL;
>>> +
>>> +		if (symbol_conf.use_callchain && sample->callchain) {
>>> +			cursor = get_tls_callchain_cursor();
>>> +			if (thread__resolve_callchain(al.thread, cursor, evsel,
>>> +						      sample, NULL, NULL,
>>> +						      scripting_max_stack)) {
>>
>> thread__resolve_callchain()
>> calls __thread__resolve_callchain()
>> calls thread__resolve_callchain_sample():
>>
>>         for (i = first_call, nr_entries = 0;
>>              i < chain_nr && nr_entries < max_stack; i++) {
>> ...
>>                 ip = chain->ips[j];
>>                 if (ip < PERF_CONTEXT_MAX)   <-- IP=ff..ff is greater than PERF_CONTEXT_MAX
>>                        ++nr_entries;
> 
> Right.
> 
>> ...
>>                 err = add_callchain_ip(thread, cursor, parent,
>>                                        root_al, &cpumode, ip,
>>                                        false, NULL, NULL, 0, symbols);
>>
>>                 if (err)
>>                         return (err < 0) ? err : 0;
>>
>> calls add_callchain_ip:
>>
>>                if (ip >= PERF_CONTEXT_MAX) {
>>                        switch (ip) {
>>                        case PERF_CONTEXT_HV:
>>                                *cpumode = PERF_RECORD_MISC_HYPERVISOR;
>>                                break;
>>                        case PERF_CONTEXT_KERNEL:
>>                                *cpumode = PERF_RECORD_MISC_KERNEL;
>>                                break;
>>                        case PERF_CONTEXT_USER:
>>                        case PERF_CONTEXT_USER_DEFERRED:
>>                                *cpumode = PERF_RECORD_MISC_USER;
>>                                break;
>>                        default:
>>                                pr_debug("invalid callchain context: "  <-- IP=ff..ff reaches default case
>>                                         "%"PRId64"\n", (s64) ip);
> 
> We may skip -1 if it's Java and *cpumode is already USER and it's s390.
> But I'd like to understand the situation first.

Let's rather not add any weird architecture-specific handling.  This is
also not limited to -1 (and 0), as Java may have used the stack save
area in any way, so it may be any random value.

>>                                /*
>>                                 * It seems the callchain is corrupted.
>>                                 * Discard all.
>>                                 */
>>                                callchain_cursor_reset(cursor);
>>                                err = 1;
>>                                goto out;
>>                        }
>>
>>> +				pr_info("cannot resolve deferred callchains\n");
>>> +				cursor = NULL;
>>> +			}
>>> +		}
>>> +
>>> +		fputc(cursor ? '\n' : ' ', fp);
>>> +		sample__fprintf_sym(sample, &al, 0, output[type].print_ip_opts,
>>> +				    cursor, symbol_conf.bt_stop_list, fp);
>>> +	}
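
For reference, the behavior quoted above can be modelled in isolation.
This is a simplified stand-in, not the perf code: resolve_chain() is a
made-up function that only mimics the context switch and the "discard
all" path of add_callchain_ip().  The context values are the ones from
the perf uapi header as far as I can tell.

```c
#include <stddef.h>
#include <stdint.h>

#define PERF_CONTEXT_HV			((uint64_t)-32)
#define PERF_CONTEXT_KERNEL		((uint64_t)-128)
#define PERF_CONTEXT_USER		((uint64_t)-512)
#define PERF_CONTEXT_USER_DEFERRED	((uint64_t)-640)
#define PERF_CONTEXT_MAX		((uint64_t)-4095)

/*
 * Copy real IPs from ips[] to out[] and skip known context markers.
 * Returns the number of IPs kept, or 0 if an unknown marker was hit,
 * which models callchain_cursor_reset() discarding the whole chain.
 */
static size_t resolve_chain(const uint64_t *ips, size_t nr,
			    uint64_t *out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < nr; i++) {
		uint64_t ip = ips[i];

		if (ip >= PERF_CONTEXT_MAX) {
			switch (ip) {
			case PERF_CONTEXT_HV:
			case PERF_CONTEXT_KERNEL:
			case PERF_CONTEXT_USER:
			case PERF_CONTEXT_USER_DEFERRED:
				continue;	/* marker only changes cpumode */
			default:
				return 0;	/* "corrupted": discard all */
			}
		}
		if (n < max)
			out[n++] = ip;
	}
	return n;
}
```

A single entry of ff..ff therefore throws away all entries, including the
valid ones before and after it, whereas any bogus value below
PERF_CONTEXT_MAX would just show up as one unresolved entry.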

Thanks and regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@...ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/

