Message-ID: <Z7m8i8YC7Mltqcpz@gmail.com>
Date: Sat, 22 Feb 2025 13:01:15 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Ard Biesheuvel <ardb@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Ard Biesheuvel <ardb+git@...gle.com>, linux-kernel@...r.kernel.org,
x86@...nel.org, Tom Lendacky <thomas.lendacky@....com>,
Nathan Chancellor <nathan@...nel.org>
Subject: Re: [RFC PATCH 1/2] x86/relocs: Improve diagnostic for rejected
absolute references
* Ard Biesheuvel <ardb@...nel.org> wrote:
> On Mon, 3 Feb 2025 at 10:40, Ingo Molnar <mingo@...nel.org> wrote:
> >
> >
> > * Ard Biesheuvel <ardb@...nel.org> wrote:
> >
> > > On Mon, 27 Jan 2025 at 17:57, Linus Torvalds
> > > <torvalds@...ux-foundation.org> wrote:
> > > >
> > > > On Mon, 27 Jan 2025 at 03:43, Ard Biesheuvel <ardb+git@...gle.com> wrote:
> > > > >
> > > > > Absolute reference to symbol '.rodata+0x180' detected in .head.text (0xffffffff820cb4ba).
> > > >
> > > > Do we have any symbol name lookup logic anywhere?
> > > >
> > >
> > > I can look into that. In this particular case, though, there is no
> > > symbol to look up as it is an anonymous jump table generated by the
> > > compiler. And the function name would be inaccurate too, as
> > > snp_cpuid_postprocess() got inlined into its caller. But I guess with
> > > the right DWARF data, at least the call site could be narrowed down a
> > > bit better.
> >
> > So patch #2 is now upstream, but should I apply this diagnostic patch
> > as-is, or will there be a -v2?
> >
>
> I'm looking into this. But given the points above, I'm reaching the
> conclusion that producing a better diagnostic based solely on vmlinux
> (which may be built without debug info) is intractable, and not even
> the DWARF metadata will describe a compiler generated jump table using
> a named ELF symbol.
>
> So I am also looking into isolating the startup code like I did for
> arm64 (and which has been adopted by RISC-V as well), but this is
> rather hairy on x86 so it will take some time. But once that lands,
> this diagnostic can be removed.
>
> So I will leave it up to you to decide whether to merge this
> improvement for now, or revert the diagnostic as you suggested before.
> This code has already identified some issues that were subsequently
> fixed, so it has already served its purpose.

So after another 2 weeks there have been no new upstream regressions
that I'm aware of, so - knock on wood - it seems we can leave the die()
in place?

But could we perhaps make it more debuggable, should it trigger - such
as not removing the relevant object file and improving the message?
I.e. make the build failure experience Linus had somewhat more
palatable...

Thanks,

	Ingo