Message-ID: <2fa31347-3021-4604-bec3-e5a2d57b77b5@efficios.com>
Date: Mon, 21 Jul 2025 11:20:34 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: rostedt <rostedt@...dmis.org>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 bpf@...r.kernel.org, x86@...nel.org, Masami Hiramatsu <mhiramat@...nel.org>,
 Josh Poimboeuf <jpoimboe@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
 Andrii Nakryiko <andrii@...nel.org>, Indu Bhagat <indu.bhagat@...cle.com>,
 "Jose E. Marchesi" <jemarch@....org>,
 Beau Belgrave <beaub@...ux.microsoft.com>, Jens Remus
 <jremus@...ux.ibm.com>, Linus Torvalds <torvalds@...ux-foundation.org>,
 Andrew Morton <akpm@...ux-foundation.org>, Jens Axboe <axboe@...nel.dk>,
 Florian Weimer <fweimer@...hat.com>, Sam James <sam@...too.org>,
 Brian Robbins <brianrob@...rosoft.com>,
 Elena Zannoni <elena.zannoni@...cle.com>
Subject: [RFC] New codectl(2) system call for sframe registration

Hi!

I've written up an RFC for a new system call to handle sframe registration
for shared libraries. There is interest in covering sframe in the short
term and JIT use-cases in the long term, so I'm covering both in this RFC
to provide the full context. Implementation-wise, we could start by
covering only the sframe use-case.

I've called it "codectl(2)" for now, but I'm of course open to feedback.

For ELF, I'm including the optional pathname, build id, and debug link
information which are really useful to translate from instruction pointers
to executable/library name, symbol, offset, source file, line number.
This is what we are using in LTTng-UST and Babeltrace debug-info filter
plugin [1], and I think this would be relevant for kernel tracers as well
so they can make the resulting stack traces meaningful to users.

sys_codectl(2)
=================

* arg0: unsigned int @option:

/* Additional labels can be added to enum code_opt, for extensibility. */

enum code_opt {
     CODE_REGISTER_ELF,
     CODE_REGISTER_JIT,
     CODE_UNREGISTER,
};

* arg1: void * @info

/* if (@option == CODE_REGISTER_ELF) */

/*
  * text_start, text_end, sframe_start, sframe_end allow unwinding of the
  * call stack.
  *
  * elf_start, elf_end, pathname, and either build_id or debug_link allow
  * mapping instruction pointers to file, symbol, offset, and source file
  * location.
  */
struct code_elf_info {
     __u64 elf_start;
     __u64 elf_end;
     __u64 text_start;
     __u64 text_end;
     __u64 sframe_start;
     __u64 sframe_end;
     __u64 pathname;              /* char *, NULL if unavailable. */

     __u64 build_id;              /* char *, NULL if unavailable. */
     __u64 debug_link_pathname;   /* char *, NULL if unavailable. */
     __u32 build_id_len;
     __u32 debug_link_crc;
};
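
A minimal sketch of how userspace might populate this structure before
registration. Nothing here is part of the RFC beyond the field names:
uint64_t/uint32_t stand in for __u64/__u32, code_elf_info_fill() and
code_elf_info_is_sane() are hypothetical helpers, and the sanity rules
(non-empty ranges, text contained in the ELF mapping) are assumptions
about what a kernel-side check could look like.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirror of the proposed layout; uint64_t/uint32_t stand in for __u64/__u32. */
struct code_elf_info {
    uint64_t elf_start;
    uint64_t elf_end;
    uint64_t text_start;
    uint64_t text_end;
    uint64_t sframe_start;
    uint64_t sframe_end;
    uint64_t pathname;             /* char *, NULL if unavailable. */
    uint64_t build_id;             /* char *, NULL if unavailable. */
    uint64_t debug_link_pathname;  /* char *, NULL if unavailable. */
    uint32_t build_id_len;
    uint32_t debug_link_crc;
};

/* Nine 64-bit fields plus two 32-bit fields: no implicit padding expected. */
static_assert(sizeof(struct code_elf_info) == 80, "unexpected padding");

/* Pointers travel as 64-bit integers so the layout is the same for 32-bit tasks. */
static inline uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Hypothetical helper: fill @info from address ranges; optional fields may be NULL. */
static void code_elf_info_fill(struct code_elf_info *info,
                               const void *elf, size_t elf_len,
                               const void *text, size_t text_len,
                               const void *sframe, size_t sframe_len,
                               const char *pathname,
                               const unsigned char *build_id,
                               uint32_t build_id_len)
{
    memset(info, 0, sizeof(*info));   /* unset optional fields read as NULL/0 */
    info->elf_start = ptr_to_u64(elf);
    info->elf_end = info->elf_start + elf_len;
    info->text_start = ptr_to_u64(text);
    info->text_end = info->text_start + text_len;
    info->sframe_start = ptr_to_u64(sframe);
    info->sframe_end = info->sframe_start + sframe_len;
    info->pathname = ptr_to_u64(pathname);
    info->build_id = ptr_to_u64(build_id);
    info->build_id_len = build_id ? build_id_len : 0;
}

/* Assumed sanity rules, not taken from the RFC. */
static int code_elf_info_is_sane(const struct code_elf_info *info)
{
    return info->elf_start < info->elf_end &&
           info->text_start < info->text_end &&
           info->sframe_start < info->sframe_end &&
           info->elf_start <= info->text_start &&
           info->text_end <= info->elf_end;
}
```

Zeroing the structure first means a caller that has no build id or debug
link leaves those fields as NULL/0, matching the "NULL if unavailable"
comments above.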


/* if (@option == CODE_REGISTER_JIT) */

/*
  * Registration of sorted JIT unwind table: The reserved memory area is
  * of size reserved_len. Userspace increases used_len as new code is
  * populated between text_start and text_end. This area is populated in
  * increasing address order, and its ABI requires that no fre entries
  * overlap. This fits the common use-case where JITs populate code into
  * a given memory area by increasing address order. The sorted unwind
  * tables can be chained with a singly-linked list as they become full.
  * Consecutive chained tables are also in sorted text address order.
  *
  * Note: if there is an eventual use-case for an unsorted JIT unwind table,
  * it would be introduced as a new "code option".
  */

struct code_jit_info {
     __u64 text_start;      /* text_start <= addr */
     __u64 text_end;        /* addr < text_end */
     __u64 unwind_head;     /* struct code_jit_unwind_table * */
};

struct code_jit_unwind_fre {
     /*
      * Contains info similar to sframe, allowing unwind for a given
      * code address range.
      */
     __u32 size;
     __u32 ip_off;  /* offset from text_start */
     __s32 cfa_off;
     __s32 ra_off;
     __s32 fp_off;
     __u8 info;
};

struct code_jit_unwind_table {
     __u64 reserved_len;
     __u64 used_len; /*
                      * Incremented by userspace (store-release), read by
                      * the kernel (load-acquire).
                      */
     __u64 next;     /* Chain with next struct code_jit_unwind_table. */
     struct code_jit_unwind_fre fre[];
};
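
A sketch of the publication protocol described above, using C11 atomics in
userspace for both sides (in the proposal, the reader would be the
kernel). Several points are assumptions not settled by the RFC:
reserved_len and used_len are taken to count bytes of fre[], the fre
"size" field is taken to be the length of the code range starting at
ip_off, and jit_table_alloc(), jit_table_append() and jit_table_lookup()
are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct code_jit_unwind_fre {
    uint32_t size;    /* assumed: length of the covered code range */
    uint32_t ip_off;  /* offset from text_start */
    int32_t cfa_off;
    int32_t ra_off;
    int32_t fp_off;
    uint8_t info;
};

struct code_jit_unwind_table {
    uint64_t reserved_len;      /* assumed: bytes reserved for fre[] */
    _Atomic uint64_t used_len;  /* assumed: bytes of fre[] published so far */
    uint64_t next;
    struct code_jit_unwind_fre fre[];
};

static struct code_jit_unwind_table *jit_table_alloc(size_t nr_fre)
{
    size_t fre_bytes = nr_fre * sizeof(struct code_jit_unwind_fre);
    struct code_jit_unwind_table *t = calloc(1, sizeof(*t) + fre_bytes);

    if (!t)
        return NULL;
    t->reserved_len = fre_bytes;
    atomic_init(&t->used_len, 0);
    return t;
}

/* Writer: fill the entry first, then publish it with a store-release. */
static int jit_table_append(struct code_jit_unwind_table *t,
                            const struct code_jit_unwind_fre *fre)
{
    uint64_t used = atomic_load_explicit(&t->used_len, memory_order_relaxed);
    size_t n = used / sizeof(*fre);

    if (used + sizeof(*fre) > t->reserved_len)
        return -ENOSPC;  /* full: the caller would chain a new table */
    if (n > 0) {
        const struct code_jit_unwind_fre *prev = &t->fre[n - 1];

        /* Enforce increasing, non-overlapping address order. */
        if (fre->ip_off < (uint64_t)prev->ip_off + prev->size)
            return -EINVAL;
    }
    t->fre[n] = *fre;
    atomic_store_explicit(&t->used_len, used + sizeof(*fre),
                          memory_order_release);
    return 0;
}

/* Reader: load-acquire used_len, then binary-search the published entries. */
static const struct code_jit_unwind_fre *
jit_table_lookup(struct code_jit_unwind_table *t, uint32_t ip_off)
{
    uint64_t used = atomic_load_explicit(&t->used_len, memory_order_acquire);
    size_t lo = 0, hi = used / sizeof(t->fre[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const struct code_jit_unwind_fre *fre = &t->fre[mid];

        if (ip_off < fre->ip_off)
            hi = mid;
        else if (ip_off - fre->ip_off >= fre->size)
            lo = mid + 1;
        else
            return fre;
    }
    return NULL;
}
```

The release/acquire pairing is what lets the reader trust every entry
below used_len without taking a lock, and the sorted, non-overlapping
order is what makes the binary search valid.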

/* if (@option == CODE_UNREGISTER) */

void *info

* arg2: size_t info_size

/*
  * Size of @info structure, allowing extensibility. See
  * copy_struct_from_user().
  */
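
For readers unfamiliar with copy_struct_from_user(), the following
userspace model sketches the size-negotiation rule it applies. It is an
illustration only, not the kernel implementation: copy_struct_model() is a
hypothetical name, plain memcpy() replaces the user-memory accessors, and
fault handling (-EFAULT) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Model of the extensible-struct copy rule:
 *  - usize < ksize: older userspace; the fields it does not know about
 *    are zero-filled in the kernel copy.
 *  - usize > ksize: newer userspace; accepted only if every byte beyond
 *    what the kernel understands is zero, otherwise -E2BIG.
 */
static int copy_struct_model(void *dst, size_t ksize,
                             const void *src, size_t usize)
{
    size_t size = usize < ksize ? usize : ksize;
    size_t rest = (usize > ksize ? usize : ksize) - size;

    if (usize < ksize) {
        /* Zero the tail that old userspace did not provide. */
        memset((char *)dst + size, 0, rest);
    } else if (usize > ksize) {
        const unsigned char *tail = (const unsigned char *)src + size;

        /* Reject unknown fields that are actually in use. */
        for (size_t i = 0; i < rest; i++)
            if (tail[i])
                return -E2BIG;
    }
    memcpy(dst, src, size);
    return 0;
}
```

Under this rule, fields appended to the info structures later must treat
zero as "not provided", so that old and new binaries keep working against
old and new kernels.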

* arg3: unsigned int flags (0)

/* Flags for extensibility. */

Your feedback is welcome,

Thanks,

Mathieu

[1] https://babeltrace.org/docs/v2.0/man7/babeltrace2-filter.lttng-utils.debug-info.7/

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

