Message-ID: <87wm4bw31t.fsf@oracle.com>
Date: Fri, 31 Oct 2025 15:37:02 +0100
From: "Jose E. Marchesi" <jose.marchesi@...cle.com>
To: Fangrui Song <maskray@...rceware.org>
Cc: Peter Zijlstra <peterz@...radead.org>, <linux-toolchains@...r.kernel.org>,
        <linux-perf-users@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: Concerns about SFrame viability for userspace stack walking


> On Thu, Oct 30, 2025 at 10:04 AM Jose E. Marchesi
> <jose.marchesi@...cle.com> wrote:
>>
>>
>> > On Thu, Oct 30, 2025 at 3:26 AM Peter Zijlstra <peterz@...radead.org> wrote:
>> >>
>> >> On Wed, Oct 29, 2025 at 11:53:32PM -0700, Fangrui Song wrote:
>> >> > I've been following the SFrame discussion and wanted to share some
>> >> > concerns about its viability for userspace adoption, based on concrete
>> >> > measurements and comparison with existing compact unwind
>> >> > implementations in LLVM.
>> >> >
>> >> > **Size overhead concerns**
>> >> >
>> >> > Measurements on an x86-64 clang binary show that .sframe (8.87 MiB) is
>> >> > approximately 10% larger than the combined size of .eh_frame and
>> >> > .eh_frame_hdr (8.06 MiB total).  This is problematic because .eh_frame
>> >> > cannot be eliminated - it contains essential information for restoring
>> >> > callee-saved registers, LSDA, and personality information needed for
>> >> > debugging (e.g. reading local variables in a coredump) and C++
>> >> > exception handling.
>> >> >
>> >> > This means adopting SFrame would result in carrying both formats, with
>> >> > a large net size increase.
>> >>
>> >> So the SFrame unwinder is fairly simple code, but what does an .eh_frame
>> >> unwinder look like? Having read most of the links in your email, there
>> >> seem to be references to DWARF byte code interpreters and stuff like
>> >> that.
>> >>
>> >> So while the format compactness is one aspect, the thing I find no
>> >> mention of, is the unwinder complexity.
>> >>
>> >> There have been a number of attempts to do DWARF unwinding in
>> >> kernel space and while I think some architectures do it, x86_64 has had
>> >> very bad experiences with it. At some point I think Linus just said no
>> >> more, no DWARF, not ever.
>> >>
>> >> So from a situation where compilers were generating bad CFI unwind
>> >> information, a horribly complex unwinder that could crash the kernel
>> >> harder than the thing it was reporting on and manual CFI annotations in
>> >> assembly that were never quite right, objtool and ORC were born.
>> >>
>> >> The wins were many:
>> >>
>> >>  - simple robust unwinder
>> >>  - no manual CFI annotations that could be wrong
>> >>  - no reliance on compilers that would get it wrong
>> >>
>> >> and I think this is where SFrame came from. I don't think the x86_64
>> >> Linux kernel will ever natively adopt SFrame, ORC works really well for
>> >> us.
>> >>
>> >> However, we do need something to unwind userspace. And yes, personally
>> >> I'm in the frame-pointer camp, that's always worked well for me.
>> >> Distros however don't seem to like it much, which means that every time
>> >> I do have to profile something userspace, I get to rebuild all the
>> >> relevant code with framepointers on (which is not hard, but tedious).
>> >>
>> >> Barring that, we need something for which the unwind code is simple and
>> >> robust -- and I *think* this has disqualified .eh_frame and full on
>> >> DWARF.
>> >>
>> >> And this is again where SFrame comes in -- its unwinder is simple,
>> >> something we can run in kernel space.
>> >>
>> >> I really don't much care for the particulars, and frame pointers work
>> >> for me -- but I do care about the kernel unwinder code. It had better be
>> >> simple and robust.
>> >>
>> >> So if you want us to use .eh_frame, great, show us a simple and robust
>> >> unwinder.
>> >
>> > Hi Peter,
>> >
>> > Thanks for this perspective—the unwinder complexity concern is
>> > absolutely valid and critical for kernel use.
>> > To clarify my motivation: I've seen attempts to use SFrame for
>> > userspace adoption
>> > (https://fedoraproject.org/wiki/Changes/SFrameInBinaries ), and I
>> > believe it's not viable for that purpose given the size overhead I
>> > documented. My concerns are primarily about userspace adoption, not
>> > the kernel's internal unwinding.
>> >
>> > If SFrame is exclusively a kernel-space feature, it could be
>> > implemented entirely within objtool – similar to how objtool --link
>> > --orc generates ORC info for vmlinux.o. This approach would eliminate
>> > the need for any modifications to assemblers and linkers, while
>> > allowing SFrame to evolve in any incompatible way.
>> >
>> > For userspace, we could instead modify assemblers and linkers to
>> > support a more compact format or an extension to .eh_frame, but it
>> > won't be SFrame (all of Apple’s compact unwind, ARM EHABI’s
>> > exidx/extab, and Microsoft’s pdata/xdata can implement C++ exception
>> > handling, while SFrame can't, leading to a huge missed opportunity.)
>>
>> The purpose of SFrame is not to be a more compact replacement for
>> .eh_frame.  It is intended to be used to walk stacks, not to unwind
>> them.
>
> Hi Jose,
>
> Let me clarify my concerns, as I think we may be talking past each
> other a bit.

Indeed, and thanks for following up :)

> **The primary concern: size overhead for userspace**
>
> The fundamental issue is that SFrame, as currently designed, results
> in a significant net size increase for userspace binaries because it
> is large and cannot replace .eh_frame (which would mean losing
> debugging and C++ exception handling support). The median .eh_frame
> size across executables and shared libraries on a Linux system is 5+%
> of total VM size:
>
> https://gist.github.com/MaskRay/5995d10b65e1e18b82931c5a8d97f55e
>
> Increasing this to 10% by adding SFrame on top is simply not viable.
> As my reply to Peter mentioned, "If SFrame is exclusively a
> kernel-space feature, it could be implemented entirely within
> objtool—similar to how objtool --link --orc generates ORC info for
> vmlinux.o."

I understand your concern, but whether the size overhead introduced by
SFrame is "viable" or not, I would say that is up to the users to
decide, not us tools engineers.  If someone wants to trade a 5% increase
in size (or whatever amount, really) for improved traceability and/or
performance, we are not going to convince them otherwise, especially if
we cannot provide a working alternative that would give them a better
tradeoff.

> **What about kernel use?**
>
> As I mentioned in my reply to Peter, if SFrame is exclusively a
> kernel-space feature, it could be implemented entirely within
> objtool—similar to how objtool --link --orc generates ORC info for
> vmlinux.o.
> I believe SFrame has a size advantage over ORC, which could make it
> attractive for this use case.
>
> However, if SFrame will not replace the existing in-kernel ORC
> unwinder (as Peter suggested), then I'm afraid SFrame doesn't have a
> clear position—neither for vmlinux nor for userspace programs.
>
> **On the ELF format issues**
>
> https://groups.google.com/g/generic-abi/c/3ZMVJDF79g8
>
> The current Binutils implementation disregards ELF and linker
> conventions, which is a serious concern for all linker maintainers.

Sorry, but binutils doesn't disregard anything it wasn't disregarding
before implementing SFrame, and certainly nothing that lld doesn't
currently disregard as well, in the sense both linkers support other
formats that require linker awareness for meaningful merging.

> The proposed SHF_OS_NONCONFORMING_DISCARD flag has faced strong
> objections in the generic ABI discussion:
> https://groups.google.com/g/generic-abi/c/3ZMVJDF79g8

I would be disappointed otherwise: it is their job to resist change, as
it is the job of everyone else to push for it whenever they feel it is
necessary.  Don't get disheartened: it is just a single flag that is being
proposed, one that doesn't involve any sort of elaborate semantics and that
is a clear logical complement to an existing flag.

So I remain optimistic, but given there are only a bunch of ELF linkers
around, IMO this flag proposal is more hygienic in nature than anything
else and its absence is hardly a showstopper.  It is clearly better to
have "if (section_is_unknown && nonconforming_discard) {
discard_input_section }" than "if (sframe && i_dont_support_sframe) {
discard_input_section }" in a few places, but if it turns out we can't
have it, well.. it isn't the end of the world.
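
To make the contrast between those two checks concrete, here is a minimal
sketch in C.  Everything in it is hypothetical and for illustration only:
the struct, the function names, and both numeric constants are placeholders
(no value for the proposed flag is given in this thread), and it is not
taken from binutils or lld.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values, assumptions for illustration only. */
#define HYP_SHF_NONCONFORMING_DISCARD 0x01000000u /* proposed flag, value invented */
#define HYP_SHT_SFRAME                0x70000001u /* stand-in section type, value invented */

/* Hypothetical, heavily reduced view of an input section header. */
struct input_section {
    uint32_t type;   /* plays the role of sh_type */
    uint64_t flags;  /* plays the role of sh_flags */
};

/* Generic rule: a linker that does not understand a section may drop it
   when the section itself says that dropping is acceptable.  The linker
   needs no knowledge of any particular format. */
static bool should_discard_generic(const struct input_section *s,
                                   bool linker_knows_type)
{
    return !linker_knows_type
        && (s->flags & HYP_SHF_NONCONFORMING_DISCARD) != 0;
}

/* Format-specific rule: the linker has to recognize the section type by
   name or number, and a similar check is repeated for each such format. */
static bool should_discard_sframe_specific(const struct input_section *s,
                                           bool linker_supports_sframe)
{
    return s->type == HYP_SHT_SFRAME && !linker_supports_sframe;
}
```

With the generic rule, a linker that has never heard of a format still does
the right thing for any section carrying the flag; with the format-specific
rule, every linker has to learn about every such section type.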

> There are also unresolved garbage collection issues. I had to disable
> -Wl,--gc-sections entirely when testing for
> https://maskray.me/blog/2025-10-26-stack-walking-space-and-time-trade-offs
> I want to emphasize: custom merging rules do not inherently conflict
> with using proper multi-section structure with section group and
> SHF_LINK_ORDER.

As I already pointed out in my previous reply, I think a solution was
found for that and it is being worked out.

> The format could be designed to work within established ELF
> conventions rather than requiring special cases throughout the linker.

Would you _please_ consider helping them to do so?  I believe there is
still time to get changes into V3, so if you have suggestions for
improving SFrame in that regard, other than offloading complexity to
clients or post-processing tools, or throwing the whole thing out with
the bath water, by all means please reach out to them.

> The concern about maintenance burden isn't about the initial
> implementation—it's about committing to long-term support for a format
> that requires custom handling in every linker while providing
> questionable benefit for its stated use case.

Thanks for explaining.

So you are saying that the question of whether to include SFrame support
in lld boils down to the questionable benefit of SFrame for its stated
use case, the primary concern there being the size overhead.  Yes?
