[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <12cef882-b5b2-43e5-9d78-abe4354069dd@oracle.com>
Date: Thu, 30 Jan 2025 13:21:21 -0800
From: Indu Bhagat <indu.bhagat@...cle.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>,
Josh Poimboeuf <jpoimboe@...nel.org>
Cc: x86@...nel.org, Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>, Ingo Molnar <mingo@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
linux-kernel@...r.kernel.org, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>,
Sam James <sam@...too.org>, linux-trace-kernel@...r.kernel.org,
Jens Remus <jremus@...ux.ibm.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Florian Weimer <fweimer@...hat.com>, Andy Lutomirski <luto@...nel.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Weinan Liu <wnliu@...gle.com>
Subject: Re: [PATCH v4 17/39] unwind_user/sframe: Add support for reading
.sframe headers
On 1/27/25 5:10 PM, Andrii Nakryiko wrote:
>>>>> +struct sframe_preamble {
>>>>> + u16 magic;
>>>>> + u8 version;
>>>>> + u8 flags;
>>>>> +} __packed;
>>>>> +
>>>>> +struct sframe_header {
>>>>> + struct sframe_preamble preamble;
>>>>> + u8 abi_arch;
>>>>> + s8 cfa_fixed_fp_offset;
>>>>> + s8 cfa_fixed_ra_offset;
>>>>> + u8 auxhdr_len;
>>>>> + u32 num_fdes;
>>>>> + u32 num_fres;
>>>>> + u32 fre_len;
>>>>> + u32 fdes_off;
>>>>> + u32 fres_off;
>>>>> +} __packed;
>>>>> +
>>>>> +struct sframe_fde {
>>>>> + s32 start_addr;
>>>>> + u32 func_size;
>>>>> + u32 fres_off;
>>>>> + u32 fres_num;
>>>>> + u8 info;
>>>>> + u8 rep_size;
>>>>> + u16 padding;
>>>>> +} __packed;
>>>> I couldn't understand from SFrame itself, but why do sframe_header,
>>>> sframe_preamble, and sframe_fde have to be marked __packed, if it's
>>>> all naturally aligned (intentionally and by design)?..
>>> Right, but the spec says they're all packed. Maybe the point is that
>>> some future sframe version is free to introduce unaligned fields.
>>>
>> SFrame specification aims to keep SFrame header and SFrame FDE members
>> at aligned boundaries in future versions.
>>
>> Only SFrame FRE related accesses may have unaligned accesses.
> Yeah, and it's actually bothering me quite a lot 🙂 I have a tentative
> proposal, maybe we can discuss this for SFrame v3? Let me briefly
> outline the idea.
>
I looked at the idea below. It could work wrt unaligned accesses.
Speaking of unaligned accesses, I will ask away: Is the reason to avoid
unaligned accesses performance hit or are there other practical reasons
to it ?
> So, currently in v2, FREs within FDEs use an array-of-structs layout.
> If we use preudo-C type definitions, it would be something like this
> for FDE + its FREs:
>
> struct FDE_and_FREs {
> struct sframe_func_desc_entry fde_metadata;
>
> union FRE {
> struct FRE8 {
> u8 sfre_start_address;
> u8 sfre_info;
> u8|u16|u32 offsets[M];
> }
> struct FRE16 {
> u16 sfre_start_address;
> u16 sfre_info;
> u8|u16|u32 offsets[M];
> }
> struct FRE32 {
> u32 sfre_start_address;
> u32 sfre_info;
> u8|u16|u32 offsets[M];
> }
> } fres[N] __packed;
> };
>
> where all fres[i]s are one of those FRE8/FRE16/FRE32, so start
> addresses have the same size, but each FRE has potentially different
> offsets sizing, so there is no common alignment, and so everything has
> to be packed and unaligned.
>
> But what if we take a struct-of-arrays approach and represent it more like:
>
> struct FDE_and_FREs {
> struct sframe_func_desc_entry fde_metadata;
> u8|u16|u32 start_addrs[N]; /* can extend to u64 as well */
> u8 sfre_infos[N];
> u8 offsets8[M8];
> u16 offsets16[M16] __aligned(2);
> u32 offsets32[M32] __aligned(4);
> /* we can naturally extend to support also u64 offsets */
> };
>
> i.e., we split all FRE records into their three constituents: start
> addresses, info bytes, and then each FRE can fall into either 8-, 16-,
> or 32-bit offsets "bucket". We collect all the offsets, depending on
> their size, into these aligned offsets{8,16,32} arrays (with natural
> extension to 64 bits, if necessary), with at most wasting 1-3 bytes to
> ensure proper alignment everywhere.
>
> Note, at this point we need to decide if we want to make FREs binary
> searchable or not.
>
> If not, we don't really need anything extra. As we process each
> start_addrs[i] and sfre_infos[i] to find matching FRE, we keep track
> of how many 8-, 16-, and 32-bit offsets already processed FREs
> consumed, and when we find the right one, we know exactly the starting
> index within offset{8,16,32}. Done.
>
> But if we were to make FREs binary searchable, we need to basically
> have an index of offset pointers to quickly find offsetsX[j] position
> corresponding to FRE #i. For that, we can have an extra array right
> next to start_addrs, "semantically parallel" to it:
>
> u8|u16|u32 start_addrs[N];
> u8|u16|u32 offset_idxs[N];
>
> where start_addrs[i] corresponds to offset_idxs[i], and offset_idxs[i]
> points to the first offset corresponding to FRE #i in offsetX[] array
> (depending on FRE's "bitness"). This is a bit more storage for this
> offset index, but for FDEs with lots of FREs this might be a
> worthwhile tradeoff.
>
> Few points:
> a) we can decide this "binary searchability" per-FDE, and for FDEs
> with 1-2-3 FREs not bother, while those with more FREs would be
> searchable ones with index. So we can combine both fast lookups,
> natural alignment of on-disk format, and compactness. The presence of
> index is just another bit in FDE metadata.
I have been going back and forth on this one. So there seem to be the
following options here:
#1. Make "binary searchability" a per-FDE decision.
#2. Make "binary searchability" a per-section decision (I expect
aarch64 to have very low number of FREs per FDE).
#3. Bake "binary searchability" into the SFrame FRE specification.
So its always ON for all FDEs. The advantage is that it makes stack
tracers simpler to implement with less code.
I do think #2, #3 appear simpler in concept.
> b) bitness of offset_idxs[] can be coupled with bitness of
> start_addrs (for simplicity), or could be completely independent and
> identified by FDE's metadata (2 more bits to define this just like
> start_addr bitness is defined). Independent probably would be my
> preference, with linker (or whoever will be producing .sframe data)
> can pick the smallest bitness that is sufficient to represent
> everything.
>
ATM, GAS does apply special logic to decide the bitness of start_addrs
per function, and ld just uses that info. Coupling the bitness of
offset_idx with bitness of start_addrs will be easy (or _easier_ I
think), but for now, I leave it as "should be doable" :)
> Yes, it's a bit more complicated to draw and explain, but everything
> will be nicely aligned, extensible to 64 bits, and (optionally at
> least) binary searchable. Implementation-wise on the kernel side it
> shouldn't be significantly more involved. Maybe the compiler would
> need to be a bit smarter when producing FDE data, but it's no rocket
> science.
>
> Thoughts?
Combining the requirements from your email and Josh's follow up:
- No unaligned accesses
- Sorted FREs
I would put compaction as a "good to have" requirement. It appears to
me that any compaction will mean a sort of post-processing which will
interfere with JIT usecase.
Powered by blists - more mailing lists