Message-ID: <1714337938.319508.1750244108368@privateemail.com>
Date: Wed, 18 Jun 2025 12:55:08 +0200 (CEST)
From: Marco Bonelli <marco@...eim.net>
To: "linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"paul.walmsley@...ive.com" <paul.walmsley@...ive.com>,
"aou@...s.berkeley.edu" <aou@...s.berkeley.edu>,
"palmer@...belt.com" <palmer@...belt.com>,
"alex@...ti.fr" <alex@...ti.fr>
Subject: RISC-V 32-bit debug builds reaching breaking point: too many symbols

RISC-V debug builds generate *millions* of symbols in vmlinux.o, and this number
is now exceeding the largest symbol table index that 32-bit Rela relocations can
represent. The ELF32_R_SYM portion of Elf32_Rela r_info is only 24 bits, so
symtab indexes larger than 16777215 (2^24 - 1) overflow and produce bogus Rela
entries pointing to the wrong symbols, ultimately breaking the build.
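To illustrate the overflow, here is a minimal Python sketch of how r_info packs
the symbol index and relocation type. It assumes the producer simply truncates
the packed value to 32 bits; the helper names mirror the ELF32_R_* macros but
are otherwise illustrative.

```python
# Sketch of Elf32_Rela r_info packing: upper 24 bits = symbol index,
# lower 8 bits = relocation type. Assumption: values are truncated to 32 bits.
SYM_BITS = 24
MAX_SYM_INDEX = (1 << SYM_BITS) - 1  # 16777215

R_RISCV_32 = 1  # relocation type number from the RISC-V psABI


def elf32_r_info(sym: int, rtype: int) -> int:
    """Pack like ELF32_R_INFO, then truncate to the 32-bit r_info field."""
    return ((sym << 8) | (rtype & 0xFF)) & 0xFFFFFFFF


def elf32_r_sym(info: int) -> int:
    """Extract the symbol index like ELF32_R_SYM."""
    return info >> 8


def elf32_r_type(info: int) -> int:
    """Extract the relocation type like ELF32_R_TYPE."""
    return info & 0xFF


for sym in (MAX_SYM_INDEX, MAX_SYM_INDEX + 1, MAX_SYM_INDEX + 5):
    info = elf32_r_info(sym, R_RISCV_32)
    print(sym, "->", elf32_r_sym(info))
# 16777215 -> 16777215
# 16777216 -> 0
# 16777221 -> 5
```

Under that assumption, any index past 16777215 silently wraps around and the
relocation ends up naming an unrelated symbol near the start of the table.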
I recently noticed this [1] when "MODPOST vmlinux.symvers" failed with thousands
of errors and warnings on a particular build of mine for v6.15 RISC-V 32-bit.
The majority (99%) of the symbols are local temporary labels (.Lxxx) that would
normally be stripped by default at link time, but it seems they cannot be
stripped because they are referenced by Rela relocations. The number of such
symbols has been huge since the very first RISC-V Linux version (v4.15) and has
been steadily growing since (I plotted a bar chart for v6.5-v6.15 here [2]).
For reference, on v4.15 a simple defconfig + debug build with GCC 11.1 produces
a vmlinux.o with around 7 million such symbols. On v6.15 with GCC 14.2 it gets
to around 15 million. We are close to the 16.8M limit, and already exceeding it
on some configurations (the discussion in [1] is an example).
What can be done to reduce them down to an acceptable number, or even better get
rid of them entirely? These local temporary symbols referenced by relocations
seem to be ubiquitous, so I suppose this is some RISC-V ELF ABI design choice.
Could those Rela relocs just avoid referencing any symbol? Or could Rel/Relr be
used instead?
Otherwise, if those really need to be kept as is, perhaps splitting debug info
out of vmlinux.o before linking into final vmlinux (when those are finally
stripped) could also be a viable solution, though the debug info would have to
be split into multiple files or the same problem would arise.
Thoughts?
Here are some stats from a custom script I hacked together (can provide source
if needed).
# Clean v6.15 tree
export PATH=/path/to/gcc-14.2.0-nolibc/riscv32-linux/bin:$PATH
export ARCH=riscv CROSS_COMPILE=riscv32-linux-
make defconfig
make 32-bit.config
./scripts/config -e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
make olddefconfig
make -j vmlinux
Stats:
Total symbols: 15138556
Temporary local symbols: 14972044
Referenced by relocations: 14971527
By section:
.rela.debug_info 8783573
.rela.debug_line 3649412
.rela.debug_ranges 1110915
.rela.debug_loc 972349
.rela.text 398789
.rela.debug_frame 283422
.rela.rodata 17182
.rela.init.text 15732
.rela__bug_table 11163
.rela__jump_table 10830
.rela.debug_aranges 10253
.rela.alternative 4510
.rela.data 3504
.rela.text.unlikely 2650
.rela__ex_table 2044
.rela.sched.text 1405
.rela.init.rodata 541
.rela.exit.text 538
.rela.ref.text 344
.rela.init.data 321
.rela.noinstr.text 234
.rela.spinlock.text 194
.rela.data..ro_after_init 98
.rela.cpuidle.text 36
.rela.srodata 30
.rela__modver 27
.rela.head.text 21
.rela.sdata 15
.rela.irqentry.text 12
.relaruntime_shift_d_hash_shift 10
.relaruntime_ptr_dentry_hashtable 10
.rela.lsm_info.init 3
.rela.data..percpu 1
By relocation kind:
R_RISCV_32 9665623
R_RISCV_SUB16 3652043
R_RISCV_ADD16 3647245
R_RISCV_ADD32 1174187
R_RISCV_SUB32 345881
R_RISCV_BRANCH 168074
R_RISCV_RVC_BRANCH 104031
R_RISCV_SET6 87790
R_RISCV_SUB6 87790
R_RISCV_PCREL_LO12_I 81819
R_RISCV_RVC_JUMP 76539
R_RISCV_SET8 29260
R_RISCV_SUB8 29260
R_RISCV_PCREL_HI20 23416
R_RISCV_SET16 4798
R_RISCV_JAL 4676
R_RISCV_PCREL_LO12_S 1410
R_RISCV_CALL_PLT 1
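For anyone who wants to gather similar numbers, below is a minimal,
stdlib-only Python sketch. It is not the script used for the stats above, and
it makes several assumptions: a little-endian ELF32 object, fewer than 0xff00
sections (no extended section numbering), and 12-byte Elf32_Rela entries. The
function name is hypothetical. It counts, per SHT_RELA section, the distinct
referenced symbols whose names start with ".L", which may not match exactly how
the per-section figures above were computed.

```python
import struct
from collections import Counter

SHT_RELA = 4


def count_local_label_refs(data: bytes) -> Counter:
    """Per SHT_RELA section, count distinct referenced symbols named .L*."""
    if data[:4] != b"\x7fELF" or data[4] != 1 or data[5] != 1:
        raise ValueError("expected a little-endian ELF32 file")

    # Elf32_Ehdr: e_shoff at offset 32; e_shentsize/e_shnum/e_shstrndx at 46.
    (e_shoff,) = struct.unpack_from("<I", data, 32)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 46)

    # Elf32_Shdr is ten 32-bit words: name, type, flags, addr, offset,
    # size, link, info, addralign, entsize.
    sections = [
        struct.unpack_from("<10I", data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]

    def cstr(table_off: int, name_off: int) -> str:
        """Read a NUL-terminated string from a string table."""
        start = table_off + name_off
        end = data.index(b"\0", start)
        return data[start:end].decode("ascii", "replace")

    shstr_off = sections[e_shstrndx][4]
    counts = Counter()
    for sh_name, sh_type, _, _, sh_offset, sh_size, sh_link, *_ in sections:
        if sh_type != SHT_RELA:
            continue
        symtab = sections[sh_link]           # sh_link -> associated symtab
        strtab_off = sections[symtab[6]][4]  # symtab's sh_link -> strtab

        # Collect distinct symbol indexes (upper 24 bits of r_info).
        seen = set()
        for off in range(sh_offset, sh_offset + sh_size, 12):
            (r_info,) = struct.unpack_from("<I", data, off + 4)
            seen.add(r_info >> 8)

        # Elf32_Sym is 16 bytes and st_name is its first word.
        local_labels = 0
        for idx in seen:
            (st_name,) = struct.unpack_from("<I", data, symtab[4] + idx * 16)
            if cstr(strtab_off, st_name).startswith(".L"):
                local_labels += 1
        counts[cstr(shstr_off, sh_name)] = local_labels
    return counts
```

It could be called as count_local_label_refs(open("vmlinux.o", "rb").read()).
On an object this large a pure-Python loop will be slow, and once indexes have
already overflowed the counts will of course be wrong too.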
[1]: https://lore.kernel.org/lkml/960240908.630790.1748641210849@privateemail.com/
[2]: https://x.com/mebeim/status/1934950596693635410
--
Marco Bonelli