linux-kernel - Re: [PATCH v4 17/39] unwind_user/sframe: Add support for reading .sframe headers

Message-ID: <12cef882-b5b2-43e5-9d78-abe4354069dd@oracle.com>
Date: Thu, 30 Jan 2025 13:21:21 -0800
From: Indu Bhagat <indu.bhagat@...cle.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Josh Poimboeuf <jpoimboe@...nel.org>
Cc: x86@...nel.org, Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>, Ingo Molnar <mingo@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        linux-kernel@...r.kernel.org, Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
        Ian Rogers <irogers@...gle.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
        linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>,
        Sam James <sam@...too.org>, linux-trace-kernel@...r.kernel.org,
        Jens Remus <jremus@...ux.ibm.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Florian Weimer <fweimer@...hat.com>, Andy Lutomirski <luto@...nel.org>,
        Masami Hiramatsu <mhiramat@...nel.org>, Weinan Liu <wnliu@...gle.com>
Subject: Re: [PATCH v4 17/39] unwind_user/sframe: Add support for reading
 .sframe headers

On 1/27/25 5:10 PM, Andrii Nakryiko wrote:
>>>>> +struct sframe_preamble {
>>>>> +       u16     magic;
>>>>> +       u8      version;
>>>>> +       u8      flags;
>>>>> +} __packed;
>>>>> +
>>>>> +struct sframe_header {
>>>>> +       struct sframe_preamble preamble;
>>>>> +       u8      abi_arch;
>>>>> +       s8      cfa_fixed_fp_offset;
>>>>> +       s8      cfa_fixed_ra_offset;
>>>>> +       u8      auxhdr_len;
>>>>> +       u32     num_fdes;
>>>>> +       u32     num_fres;
>>>>> +       u32     fre_len;
>>>>> +       u32     fdes_off;
>>>>> +       u32     fres_off;
>>>>> +} __packed;
>>>>> +
>>>>> +struct sframe_fde {
>>>>> +       s32     start_addr;
>>>>> +       u32     func_size;
>>>>> +       u32     fres_off;
>>>>> +       u32     fres_num;
>>>>> +       u8      info;
>>>>> +       u8      rep_size;
>>>>> +       u16     padding;
>>>>> +} __packed;
>>>> I couldn't understand from SFrame itself, but why do sframe_header,
>>>> sframe_preamble, and sframe_fde have to be marked __packed, if it's
>>>> all naturally aligned (intentionally and by design)?..
>>> Right, but the spec says they're all packed.  Maybe the point is that
>>> some future sframe version is free to introduce unaligned fields.
>>>
>> SFrame specification aims to keep SFrame header and SFrame FDE members
>> at aligned boundaries in future versions.
>>
>> Only SFrame FRE related accesses may have unaligned accesses.
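The alignment claim for the FDE and header can be checked at compile time. The following is only a sketch: the typedefs are userspace stand-ins for the kernel's u8/u16/u32/s8/s32, the __packed fallback assumes GCC or Clang, and the *_nat structs are hypothetical unpacked twins that are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel integer types (assumption). */
typedef uint8_t  u8;
typedef int8_t   s8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef int32_t  s32;

#ifndef __packed
#define __packed __attribute__((packed))	/* GCC/Clang spelling */
#endif

/* Layouts as quoted from the patch. */
struct sframe_preamble {
	u16 magic;
	u8  version;
	u8  flags;
} __packed;

struct sframe_header {
	struct sframe_preamble preamble;
	u8  abi_arch;
	s8  cfa_fixed_fp_offset;
	s8  cfa_fixed_ra_offset;
	u8  auxhdr_len;
	u32 num_fdes;
	u32 num_fres;
	u32 fre_len;
	u32 fdes_off;
	u32 fres_off;
} __packed;

struct sframe_fde {
	s32 start_addr;
	u32 func_size;
	u32 fres_off;
	u32 fres_num;
	u8  info;
	u8  rep_size;
	u16 padding;
} __packed;

/* Hypothetical unpacked twins: same members, no __packed. */
struct sframe_preamble_nat { u16 magic; u8 version; u8 flags; };

struct sframe_header_nat {
	struct sframe_preamble_nat preamble;
	u8  abi_arch;
	s8  cfa_fixed_fp_offset;
	s8  cfa_fixed_ra_offset;
	u8  auxhdr_len;
	u32 num_fdes, num_fres, fre_len, fdes_off, fres_off;
};

struct sframe_fde_nat {
	s32 start_addr;
	u32 func_size, fres_off, fres_num;
	u8  info;
	u8  rep_size;
	u16 padding;
};

/* If every member is naturally aligned, packing changes nothing. */
static_assert(sizeof(struct sframe_header) == 28, "packed header size");
static_assert(sizeof(struct sframe_header_nat) == sizeof(struct sframe_header),
	      "header needs no padding when unpacked");
static_assert(offsetof(struct sframe_header_nat, num_fdes) ==
	      offsetof(struct sframe_header, num_fdes), "same member offsets");
static_assert(sizeof(struct sframe_fde) == 20, "packed FDE size");
static_assert(sizeof(struct sframe_fde_nat) == sizeof(struct sframe_fde),
	      "FDE needs no padding when unpacked");
```

With GCC and Clang, the packed attribute also lowers the struct's own alignment to 1, so the two variants can still differ in the loads a compiler emits on strict-alignment targets even though the byte layout is identical.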
> Yeah, and it's actually bothering me quite a lot 🙂 I have a tentative
> proposal, maybe we can discuss this for SFrame v3? Let me briefly
> outline the idea.
> 

I looked at the idea below.  It could work wrt unaligned accesses.

Speaking of unaligned accesses, I will ask away: is the reason to avoid
unaligned accesses the performance hit, or are there other practical
reasons for it?
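
For reference on what an unaligned FRE access looks like in code, here is a minimal sketch. The helper names read_u16_unaligned() and read_u32_unaligned() are hypothetical and not from the patch; the kernel has its own get_unaligned() helpers for this. In ISO C, casting a misaligned address to u32 * and dereferencing it is undefined behavior, and some architectures fault or take a slow path on such loads, so portable readers usually copy the bytes out instead:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 16-bit value from an address with no alignment guarantee.
 * memcpy() lets the compiler pick a safe instruction sequence for the
 * target instead of assuming natural alignment.
 */
static inline uint16_t read_u16_unaligned(const void *p)
{
	uint16_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Same idea for a 32-bit value, in host byte order. */
static inline uint32_t read_u32_unaligned(const void *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

On targets that handle unaligned loads in hardware, compilers typically reduce these helpers to a single load; on stricter targets they fall back to byte accesses.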

> So, currently in v2, FREs within FDEs use an array-of-structs layout.
> If we use pseudo-C type definitions, it would be something like this
> for FDE + its FREs:
> 
> struct FDE_and_FREs {
>      struct sframe_func_desc_entry fde_metadata;
> 
>      union FRE {
>          struct FRE8 {
>              u8 sfre_start_address;
>              u8 sfre_info;
>              u8|u16|u32 offsets[M];
>          }
>          struct FRE16 {
>              u16 sfre_start_address;
>              u16 sfre_info;
>              u8|u16|u32 offsets[M];
>          }
>          struct FRE32 {
>              u32 sfre_start_address;
>              u32 sfre_info;
>              u8|u16|u32 offsets[M];
>          }
>      } fres[N] __packed;
> };
> 
> where all fres[i]s are one of those FRE8/FRE16/FRE32, so start
> addresses have the same size, but each FRE has potentially different
> offsets sizing, so there is no common alignment, and so everything has
> to be packed and unaligned.
> 
> But what if we take a struct-of-arrays approach and represent it more like:
> 
> struct FDE_and_FREs {
>      struct sframe_func_desc_entry fde_metadata;
>      u8|u16|u32 start_addrs[N]; /* can extend to u64 as well */
>      u8 sfre_infos[N];
>      u8 offsets8[M8];
>      u16 offsets16[M16] __aligned(2);
>      u32 offsets32[M32] __aligned(4);
>      /* we can naturally extend to support also u64 offsets */
> };
> 
> i.e., we split all FRE records into their three constituents: start
> addresses, info bytes, and then each FRE can fall into either 8-, 16-,
> or 32-bit offsets "bucket". We collect all the offsets, depending on
> their size, into these aligned offsets{8,16,32} arrays (with natural
> extension to 64 bits, if necessary), wasting at most 1-3 bytes to
> ensure proper alignment everywhere.
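
The "at most 1-3 bytes" padding bound can be seen by computing where each array of the proposed layout would start. This is a sketch only: struct fre_layout and fre_layout_compute() are hypothetical names, and the offsets are taken relative to the end of the FDE metadata, which is assumed here to be 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Round off up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t off, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (off + align - 1) & ~(align - 1);
}

/* Hypothetical result type: byte offset of each array, plus total size. */
struct fre_layout {
	size_t start_addrs_off;
	size_t infos_off;
	size_t offsets8_off;
	size_t offsets16_off;
	size_t offsets32_off;
	size_t total;
};

/*
 * n:          number of FREs in this FDE
 * addr_size:  bytes per start address (1, 2 or 4)
 * m8/m16/m32: number of 8-, 16- and 32-bit offsets
 */
static struct fre_layout fre_layout_compute(size_t n, size_t addr_size,
					    size_t m8, size_t m16, size_t m32)
{
	struct fre_layout l;

	l.start_addrs_off = 0;			/* base assumed 4-byte aligned */
	l.infos_off = l.start_addrs_off + n * addr_size;
	l.offsets8_off = l.infos_off + n;	/* one info byte per FRE */
	/* At most 1 byte of padding here... */
	l.offsets16_off = align_up(l.offsets8_off + m8, 2);
	/* ...and at most 2 more here, since we are already 2-byte aligned. */
	l.offsets32_off = align_up(l.offsets16_off + 2 * m16, 4);
	l.total = l.offsets32_off + 4 * m32;
	return l;
}
```

For example, three FREs with 1-byte start addresses, two 8-bit offsets, one 16-bit offset and one 32-bit offset occupy 16 bytes, 2 of which are padding before offsets32[].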
> 
> Note, at this point we need to decide if we want to make FREs binary
> searchable or not.
> 
> If not, we don't really need anything extra. As we process each
> start_addrs[i] and sfre_infos[i] to find matching FRE, we keep track
> of how many 8-, 16-, and 32-bit offsets the already-processed FREs
> consumed, and when we find the right one, we know exactly the starting
> index within offsets{8,16,32}. Done.
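
The linear scan described above could look roughly like the sketch below. Several things here are assumptions rather than part of the proposal: start addresses are widened to u32 for simplicity, the info byte is decoded the way SFrame v2 lays it out (offset count in bits 1-4, offset size in bits 5-6), and struct fre_hit and fre_find_linear() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Info-byte decoding, assumed to match the SFrame v2 bit layout. */
#define FRE_OFF_COUNT(info)	(((info) >> 1) & 0xf)
#define FRE_OFF_SIZE(info)	(((info) >> 5) & 0x3)	/* 0: 8, 1: 16, 2: 32 bits */

/* Hypothetical lookup result. */
struct fre_hit {
	size_t   fre_idx;	/* index of the matching FRE */
	unsigned size_code;	/* which offsets{8,16,32}[] array to use */
	size_t   first_off;	/* index of its first offset in that array */
};

/*
 * Find the last FRE whose start address is <= ip_off, counting how many
 * offsets of each width the earlier FREs consumed along the way.
 * start_addrs[] must be sorted in ascending order.
 * Returns 0 on success, -1 if nothing matches or an info byte is invalid.
 */
static int fre_find_linear(const uint32_t *start_addrs, const uint8_t *infos,
			   size_t n, uint32_t ip_off, struct fre_hit *hit)
{
	size_t used[3] = { 0, 0, 0 };
	int found = 0;

	assert(hit != NULL);
	for (size_t i = 0; i < n; i++) {
		unsigned sz = FRE_OFF_SIZE(infos[i]);

		if (start_addrs[i] > ip_off)
			break;
		if (sz > 2)
			return -1;	/* reserved size code */
		hit->fre_idx = i;
		hit->size_code = sz;
		hit->first_off = used[sz];
		found = 1;
		used[sz] += FRE_OFF_COUNT(infos[i]);
	}
	return found ? 0 : -1;
}
```

No extra index is stored in this variant; the cost is that every lookup walks the FREs from the start of the FDE.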
> 
> But if we were to make FREs binary searchable, we need to basically
> have an index of offset pointers to quickly find offsetsX[j] position
> corresponding to FRE #i. For that, we can have an extra array right
> next to start_addrs, "semantically parallel" to it:
> 
> u8|u16|u32 start_addrs[N];
> u8|u16|u32 offset_idxs[N];
> 
> where start_addrs[i] corresponds to offset_idxs[i], and offset_idxs[i]
> points to the first offset corresponding to FRE #i in the offsetsX[] array
> (depending on FRE's "bitness"). This is a bit more storage for this
> offset index, but for FDEs with lots of FREs this might be a
> worthwhile tradeoff.
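
With the parallel offset_idxs[] array, the lookup becomes an ordinary binary search. Again a sketch under assumptions: both arrays are widened to u32, fre_find_bsearch() is a hypothetical name, and the caller is expected to read sfre_infos[*fre_idx] afterwards to learn which offsetsX[] array the returned index refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Binary-search start_addrs[] (sorted ascending) for the last entry that
 * is <= ip_off, then use the parallel offset_idxs[] array to jump
 * straight to that FRE's first offset.
 * Returns 0 on success, -1 if ip_off precedes the first FRE.
 */
static int fre_find_bsearch(const uint32_t *start_addrs,
			    const uint32_t *offset_idxs, size_t n,
			    uint32_t ip_off, size_t *fre_idx,
			    uint32_t *first_off)
{
	size_t lo = 0, hi = n;

	assert(fre_idx != NULL && first_off != NULL);

	/* Invariant: entries below lo are <= ip_off, entries from hi up are > ip_off. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (start_addrs[mid] <= ip_off)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == 0)
		return -1;

	*fre_idx = lo - 1;
	*first_off = offset_idxs[lo - 1];
	return 0;
}
```

The lookup takes O(log N) steps per FDE instead of O(N), at the cost of one extra index entry per FRE.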
> 
> A few points:
>    a) we can decide this "binary searchability" per-FDE, and for FDEs
> with 1-2-3 FREs not bother, while those with more FREs would be
> searchable ones with index. So we can combine both fast lookups,
> natural alignment of on-disk format, and compactness. The presence of
> index is just another bit in FDE metadata.

I have been going back and forth on this one. So there seem to be the 
following options here:
   #1. Make "binary searchability" a per-FDE decision.
   #2. Make "binary searchability" a per-section decision (I expect
aarch64 to have a very low number of FREs per FDE).
   #3. Bake "binary searchability" into the SFrame FRE specification,
so it's always ON for all FDEs.  The advantage is that it makes stack
tracers simpler to implement with less code.

I do think #2 and #3 appear simpler in concept.

>    b) bitness of offset_idxs[] can be coupled with bitness of
> start_addrs (for simplicity), or could be completely independent and
> identified by FDE's metadata (2 more bits to define this just like
> start_addr bitness is defined). Independent would probably be my
> preference, so that the linker (or whoever produces the .sframe data)
> can pick the smallest bitness that is sufficient to represent
> everything.
> 

ATM, GAS does apply special logic to decide the bitness of start_addrs 
per function, and ld just uses that info.  Coupling the bitness of 
offset_idx with bitness of start_addrs will be easy (or _easier_ I 
think), but for now, I leave it as "should be doable" :)

> Yes, it's a bit more complicated to draw and explain, but everything
> will be nicely aligned, extensible to 64 bits, and (optionally at
> least) binary searchable. Implementation-wise on the kernel side it
> shouldn't be significantly more involved. Maybe the compiler would
> need to be a bit smarter when producing FDE data, but it's no rocket
> science.
> 
> Thoughts?

Combining the requirements from your email and Josh's follow up:
   - No unaligned accesses
   - Sorted FREs

I would put compaction as a "good to have" requirement.  It appears to
me that any compaction will mean some sort of post-processing, which will
interfere with the JIT use case.