linux-kernel - Re: [PATCH 2/2] kallsyms: build faster by using .incbin


Message-ID: <CAK7LNAR059LrBgvZVfapTGtU_VrHhHdrk1XfCbACPe-7109UiQ@mail.gmail.com>
Date: Fri, 23 Feb 2024 13:26:27 +0900
From: Masahiro Yamada <masahiroy@...nel.org>
To: Jann Horn <jannh@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Nick Desaulniers <ndesaulniers@...gle.com>, 
	Miguel Ojeda <ojeda@...nel.org>, Zhen Lei <thunder.leizhen@...wei.com>, 
	Arnd Bergmann <arnd@...db.de>, linux-kernel@...r.kernel.org, linux-kbuild@...r.kernel.org
Subject: Re: [PATCH 2/2] kallsyms: build faster by using .incbin

On Thu, Feb 22, 2024 at 11:21 PM Jann Horn <jannh@...gle.com> wrote:
>
> On Thu, Feb 22, 2024 at 12:20 PM Jann Horn <jannh@...gle.com> wrote:
> > On Thu, Feb 22, 2024 at 5:07 AM Masahiro Yamada <masahiroy@...nel.org> wrote:
> > > On Thu, Feb 22, 2024 at 5:27 AM Jann Horn <jannh@...gle.com> wrote:
> > > >
> > > > Currently, kallsyms builds a big assembly file (~19M with a normal
> > > > kernel config), and then the assembler has to turn that big assembly
> > > > file back into binary data, which takes around a second per kallsyms
> > > > invocation. (Normally there are two kallsyms invocations per build.)
> > > >
> > > > It is much faster to instead directly output binary data, which can
> > > > be imported in an assembly file using ".incbin". This is also the
> > > > approach taken by arch/x86/boot/compressed/mkpiggy.c.
> > >
> > >
> > > Yes, that is a sensible case because it just wraps the binary
> > > without any modification.
> > >
> > >
> > >
> > >
> > > > So this patch switches kallsyms to that approach.
> > > >
> > > > A complication with this is that the endianness of numbers between
> > > > host and target might not match (for example, when cross-compiling);
> > > > and there seems to be no kconfig symbol that tells us what endianness
> > > > the target has.
> > >
> > >
> > >
> > > CONFIG_CPU_BIG_ENDIAN is it.
> > >
> > >
> > >
> > > You could do this:
> > >
> > > if is_enabled CONFIG_CPU_BIG_ENDIAN; then
> > >         kallsymopt="${kallsymopt} --big-endian"
> > > fi
> > >
> > > if is_enabled CONFIG_64BIT; then
> > >         kallsymopt="${kallsymopt} --64bit"
> > > fi
> >
> > Aah, nice, thanks, I searched for endianness kconfig flags but somehow
> > missed that one.
> >
> > Though actually, I think further optimizations might make it necessary
> > to directly operate on ELF files anyway, in which case it would
> > probably be easier to keep using the ELF header...
> >
> > > > So pass the path to the intermediate vmlinux ELF file to the kallsyms
> > > > tool, and let it parse the ELF header to figure out the target's
> > > > endianness.
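
[A minimal sketch of the ELF-header approach described above, in Python rather than the C of the real kallsyms tool. The byte offsets and constants come from the ELF identification block (e_ident); the function names target_format and pack_word are hypothetical and are not taken from the patch.]

```python
import struct

# Byte offsets inside e_ident and the standard values stored there.
EI_CLASS, EI_DATA = 4, 5
ELFCLASS64 = 2
ELFDATA2LSB, ELFDATA2MSB = 1, 2


def target_format(ident: bytes):
    """Return (struct byte-order prefix, is_64bit) for an ELF e_ident block.

    Hypothetical helper: the caller is assumed to pass the first bytes
    of the intermediate vmlinux file.
    """
    if len(ident) < 6 or ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[EI_DATA] == ELFDATA2LSB:
        order = "<"  # little-endian target
    elif ident[EI_DATA] == ELFDATA2MSB:
        order = ">"  # big-endian target
    else:
        raise ValueError("unknown ELF data encoding")
    return order, ident[EI_CLASS] == ELFCLASS64


def pack_word(value: int, order: str, is_64bit: bool) -> bytes:
    """Serialize one pointer-sized word in the target's byte order."""
    return struct.pack(order + ("Q" if is_64bit else "I"), value)
```

[With this, the host's own byte order never enters the picture: the output bytes depend only on what the target ELF header declares.]
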
> > > >
> > > > I have verified that running kallsyms without these changes and
> > > > kallsyms with these changes on the same input System.map results
> > > > in identical object files.
> > > >
> > > > This change reduces the time for an incremental kernel rebuild
> > > > (touch fs/ioctl.c, then re-run make) from 27.7s to 24.1s (medians
> > > > over 16 runs each) on my machine - saving around 3.6 seconds.
> > >
> > >
> > >
> > >
> > > This reverts bea5b74504742f1b51b815bcaf9a70bddbc49ce3
> > >
> > > Somebody might struggle with debugging again, but I am not sure.
> > >
> > > Arnd?
> > >
> > >
> > >
> > > If the effort were "I invented a way to do kallsyms in
> > > one pass instead of three", it would be so much more attractive.
> >
> > Actually, I was chatting with someone about this yesterday, and I
> > think I have an idea on how to get rid of two link steps... I might
> > try out some stuff and then come back with another version of this
> > series afterwards.
>
> I think basically we could change kallsyms so that on the second run,
> it checks if the kallsyms layout is the same as on the first run, and
> if yes, directly overwrite the relevant part of vmlinux. (And adjust
> the relative_base.) That would save us the final link... does that
> sound like a reasonable idea?


I do not know how we can save the final link.

Inserting the kallsyms data into the .rodata section
would change the address of all symbols that come after.
Only the linker can sort out the address change.
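
[A toy model of the address shift, as a minimal sketch. It assumes items are simply placed back to back with no alignment padding; the names and sizes are made up for illustration and do not correspond to a real vmlinux layout.]

```python
def layout(items):
    """Assign consecutive start addresses to (name, size) pairs, from 0."""
    addresses = {}
    cursor = 0
    for name, size in items:
        addresses[name] = cursor
        cursor += size
    return addresses


# Hypothetical layout: the kallsyms blob is empty at first, then grows.
before = layout([("rodata_a", 0x100), ("kallsyms", 0x0),
                 ("rodata_b", 0x80), ("data", 0x200)])
after = layout([("rodata_a", 0x100), ("kallsyms", 0x40),
                ("rodata_b", 0x80), ("data", 0x200)])

# Everything placed after the blob moves by the size of the insertion.
moved = sorted(name for name in before if before[name] != after[name])
shift = after["rodata_b"] - before["rodata_b"]
```

[In this model, rodata_b and data both move by 0x40 bytes, so every reference to them has to be recomputed.]
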


>
> I don't really have any good ideas for saving more than that, given
> that we want to squeeze the kallsyms in between the data and bss
> sections, so we can't just append it at the end of vmlinux... we could
> get the symbol list from vmlinux.o instead of linking
> ".tmp_vmlinux.kallsyms1", but the comments in link-vmlinux.sh say that
> extra linker-generated symbols might appear, and I guess we probably
> don't want to miss those...


I knew it was not trivial.
If you do not have an idea, you do not need to change it.




-- 
Best Regards
Masahiro Yamada