Message-ID: <87h5vg5tvj.fsf@oracle.com>
Date: Thu, 30 Oct 2025 15:47:28 +0100
From: "Jose E. Marchesi" <jose.marchesi@...cle.com>
To: Fangrui Song <maskray@...rceware.org>
Cc: <linux-toolchains@...r.kernel.org>, <linux-perf-users@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: Concerns about SFrame viability for userspace stack walking

Hi Fangrui.

> - https://maskray.me/blog/2025-09-28-remarks-on-sframe (covering
> mandatory index building problems, section group compliance and
> garbage collection issues, and version compatibility challenges)

After reading your blog it seems to me that your main concern is (was?)
that SFrame "violates ELF rules" because it is not amenable to
concatenation (the result of concatenating two "sframes" is not a valid
sframe) and thus it requires specific linker support to merge these
sections.  This support, once in place, will have to be maintained
moving forward, and evolved along with the format.
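
To make the concatenation point concrete, here is a minimal sketch in
Python.  It uses a toy layout (a fixed-size header followed by sorted
entries) that is only an assumption for illustration and NOT the real
SFrame encoding; MAGIC, encode, decode and merge are hypothetical names.

```python
import struct

# Toy layout, NOT the real SFrame encoding: a fixed-size header
# (magic, entry count) followed by entries sorted by start address.
HEADER = struct.Struct("<HH")   # magic, number of entries
ENTRY = struct.Struct("<II")    # function start address, function size
MAGIC = 0x5346                  # hypothetical magic value

def encode(entries):
    """Serialize (start, size) pairs into one self-contained section."""
    body = b"".join(ENTRY.pack(s, n) for s, n in sorted(entries))
    return HEADER.pack(MAGIC, len(entries)) + body

def decode(blob):
    """Read the entries announced by the header at the start of blob."""
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    return [ENTRY.unpack_from(blob, HEADER.size + i * ENTRY.size)
            for i in range(count)]

def merge(blobs):
    """What a format-aware linker has to do: decode, combine, re-encode."""
    return encode([e for b in blobs for e in decode(b)])

a = encode([(0x2000, 0x40)])
b = encode([(0x1000, 0x80)])

naive = a + b              # plain concatenation of two input sections
merged = merge([a, b])     # format-aware merge
```

In this sketch decode(naive) only sees the entry from the first input,
because the second header ends up as stray bytes in the middle of the
data, while decode(merged) returns both entries in sorted order.
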
First, SFrame is not concatenable because its main design goal is to be
simple to _use_ (not necessarily trivial to link) so it provides little
luxuries like a fixed-size header (uoh), it is self-contained, it does
not require an explicit index to be searched efficiently (instead you
just binary search on the data in place), it has no run-time
relocations, etc.  You don't need to allocate memory dynamically to
decode and stack-walk using SFrame, and it has even been proved (by
the parca project people) that it is possible to write actual verifiable
BPF to walk a userspace stack using an internal format that is in
essence identical to SFrame.  You may not care about any of that, but the
people wanting to use the format certainly do, and that's the reason
SFrame (and ORC for that matter) is the way it is.  Sure, nobody would
object to making SFrame concatenable to make your (and mine, incidentally)
life easier, but not at the cost of burdening users with extra
complexity that they don't need and are not willing to assume: why would
they?  We are not even quite sure if such a thing is achievable in this
case: you either put the complexity in the linker, or on the users, but
you cannot make it magically disappear.  If you have some _concrete_
suggestion on improving the format, please by all means let the SFrame
people know, or consider following up in threads like [1] where these
details are being discussed.  That would be helpful indeed.
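
The "binary search on the data in place" idea can be sketched as
follows.  This is a minimal illustration assuming a hypothetical
fixed-size record of (start address, size); the real SFrame function
descriptor entry carries more fields, and find_function is a made-up
name.  The point is only that records are read where they lie, without
building a separate index or an intermediate list.

```python
import struct

# Hypothetical fixed-size record: function start address and size.
# The real SFrame function descriptor entry has more fields.
ENTRY = struct.Struct("<II")

def find_function(table, pc):
    """Return (start, size) of the entry covering pc, or None.

    table is a bytes-like object holding records sorted by start
    address.  Records are decoded one at a time, directly from the
    buffer; no list or index is built.
    """
    lo, hi = 0, len(table) // ENTRY.size
    while lo < hi:
        mid = (lo + hi) // 2
        start, size = ENTRY.unpack_from(table, mid * ENTRY.size)
        if pc < start:
            hi = mid
        elif pc >= start + size:
            lo = mid + 1
        else:
            return start, size
    return None

# Three sorted records standing in for a mapped section.
table = b"".join(ENTRY.pack(s, n) for s, n in
                 [(0x1000, 0x80), (0x2000, 0x40), (0x3000, 0x10)])
```

Because every record has the same size, the i-th record sits at offset
i * ENTRY.size, which is what makes the search possible without any
auxiliary table.
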
Second, as bizarre as it may be, having non-concatenable data in an ELF
section only "violates ELF rules" if the linker _doesn't know_ about the
type of the section containing it.  So the solution is obvious: make
your linker aware of SFrame sections and, voilà, the ELF violation goes
away.  You can either merge the SFrame data, or just discard the input
sections, or call the Linker Police.  Just do _something_ about it,
because doing nothing leads to emitting nonsense, and that pisses off
everyone.

Third, this "problem" is not exclusive to SFrame.  Other formats like EH
Frame also require some degree of linker awareness, in the form of the
generation of an explicit index, or merging, or whatever...  apparently
to nobody's scandal, lucky them.  Pushing the burden of dealing with
this to users or to post-processing tools, like you suggest in your
blog, is IMO hardly a satisfactory solution: it is rather a non-solution
and an attempt at making your problem everyone's problem.  Now, if you
then move the goalpost and claim the problem is the _degree_ of linker
involvement, as you seem to suggest in your blog, then again your
feedback is very welcome to make SFrame more linker-friendly, as long as
it is in the form of concrete constructive suggestions _and_ not at the
cost of the user's requirements for the format.

Fourth, some people think that it is unreasonable to expect all the ELF
linkers in existence to be aware of SFrame sections (not me; you can
count them all on your fingers and no toes).  ELF already supports
a standard section flag SHF_OS_NONCONFORMING that tries to deal with
cases like this... and fails miserably: unknown sections marked with
that flag are not required to be amenable to concatenation, but the
problem is that upon encountering them the linker is expected to abort
the link with an error.  This is hardly convenient for anyone, so the
SFrame people are currently proposing to the gABI [2] the addition of a
new flag SHF_OS_NONCONFORMANT_DISCARD that would make the linker just
discard the unknown input section rather than aborting the link.
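
The difference between the existing flag and the proposed one can be
sketched as a small decision function.  This is only an illustration
under stated assumptions: 0x100 is the gABI value of the existing
nonconforming flag, but the bit chosen here for the proposed discard
flag is hypothetical (the proposal in [2] may use a different value or
encoding), action_for_section is a made-up name, and the gABI conditions
on OS-specific section types and flags are simplified to a single
"does the linker know this section type" boolean.

```python
# Existing gABI flag: the section needs special processing, and a linker
# that does not understand it is expected to reject the input file.
SHF_OS_NONCONFORMING = 0x100

# Hypothetical bit value for the proposed flag; the actual proposal may
# pick a different value or a different encoding altogether.
SHF_OS_NONCONFORMANT_DISCARD = 0x1000

def action_for_section(sh_flags, type_is_known):
    """Return what a linker would do with one input section."""
    if type_is_known:
        return "process"        # format-aware handling, e.g. merging
    if sh_flags & SHF_OS_NONCONFORMANT_DISCARD:
        return "discard"        # proposed: drop the section, keep linking
    if sh_flags & SHF_OS_NONCONFORMING:
        return "error"          # current behavior: abort the link
    return "concatenate"        # default rule for ordinary sections
```

Under these assumptions, a linker that knows nothing about the section
type aborts today when it sees the nonconforming flag, and would
silently drop the section instead if the proposed flag were set.
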
Fifth, the problems related to GC and section grouping were discussed
during Cauldron [3] and I believe a solution has already been found,
proposed independently by Roland in the gABI discussion thread.  I think
that solution is being written down and will have to be reviewed before
being used in SFrame V3.  Your help on that review would also be very
much appreciated, considering your vast experience in these matters.  I
suppose it will happen in the binutils list.  I am sure they will CC you
in the relevant thread so you won't miss it.

Sixth, several people have repeatedly pointed out that it is not
reasonable to extrapolate the big gap between SFrame V2 and V3 to future
revisions of the format, and that not implementing V2 in lld is
perfectly ok, because the kernel will start directly with V3.  The
SFrame maintainer has assured, also repeatedly, that she is well aware
that any change in the spec will have to be very carefully considered to
avoid or minimize any impact on the linkers.  I would say the fact she
also maintains the SFrame support in ld may also serve as bail to
guarantee her good behavior, at least to some extent ;)

Finally, it is not clear to me at all why supporting SFrame would result
in such an unbearable burden to lld's maintenance, as you seem to
expect, given everything is fine on the GNU side.  Please don't
misunderstand me: you know your linker better than anyone else and I am
sure there are good reasons that justify such apprehension.  What is it?
Is it lack of contributors?  Is the lld codebase particularly difficult
to maintain or extend?  Looking at [4] the people that are contributing
the SFrame support to LLVM are also volunteering to maintain it moving
forward... isn't that enough?  Perhaps you need a co-maintainer?  Can we,
or anyone else, help somehow?  If so, how?

Salud!

[1] https://sourceware.org/pipermail/binutils/2025-October/145086.html
[2] https://groups.google.com/g/generic-abi/c/3ZMVJDF79g8
[3] https://www.youtube.com/watch?v=L2UmAp39xqk
[4] https://discourse.llvm.org/t/rfc-adding-sframe-support-to-llvm/86900