[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240721133633.47721-15-lasse.collin@tukaani.org>
Date: Sun, 21 Jul 2024 16:36:29 +0300
From: Lasse Collin <lasse.collin@...aani.org>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Lasse Collin <lasse.collin@...aani.org>,
Sam James <sam@...too.org>,
linux-kernel@...r.kernel.org,
Simon Glass <sjg@...omium.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
Albert Ou <aou@...s.berkeley.edu>,
Jubin Zhong <zhongjubin@...wei.com>,
Jules Maselbas <jmaselbas@...v.net>,
linux-arm-kernel@...ts.infradead.org,
linux-riscv@...ts.infradead.org
Subject: [PATCH v2 14/16] xz: Adjust arch-specific options for better kernel compression
Use LZMA2 options that match the arch-specific alignment of instructions.
This change reduces compressed kernel size 0-2 % depending on the arch.
On 1-byte-aligned x86 it makes no difference and on 4-byte-aligned archs
it helps the most.
Use the ARM-Thumb filter for ARM-Thumb2 kernels. This reduces compressed
kernel size about 5 %.[1] Previously such kernels were compressed using
the ARM filter which didn't do anything useful with ARM-Thumb2 code.
Add BCJ filter support for ARM64 and RISC-V. Compared to unfiltered XZ
or plain LZMA, the compressed kernel size is reduced about 5 % on ARM64
and 7 % on RISC-V. A new enough version of the xz tool is required: 5.4.0
for ARM64 and 5.6.0 for RISC-V. With an old xz version, a message is
printed to standard error and the kernel is compressed without the filter.
Update lib/decompress_unxz.c to match the changes to xz_wrap.sh.
Update the CONFIG_KERNEL_XZ help text in init/Kconfig:
- Add the RISC-V and ARM64 filters.
- Clarify that the PowerPC filter is for big endian only.
- Omit IA-64.
Link: https://lore.kernel.org/lkml/1637379771-39449-1-git-send-email-zhongjubin@huawei.com/ [1]
Cc: Simon Glass <sjg@...omium.org>
Cc: Catalin Marinas <catalin.marinas@....com>
Cc: Will Deacon <will@...nel.org>
Cc: Paul Walmsley <paul.walmsley@...ive.com>
Cc: Palmer Dabbelt <palmer@...belt.com>
Cc: Albert Ou <aou@...s.berkeley.edu>
Cc: Jubin Zhong <zhongjubin@...wei.com>
Cc: Jules Maselbas <jmaselbas@...v.net>
Cc: linux-arm-kernel@...ts.infradead.org
Cc: linux-riscv@...ts.infradead.org
Reviewed-by: Sam James <sam@...too.org>
Signed-off-by: Lasse Collin <lasse.collin@...aani.org>
---
Notes:
v2: Avoid the "eval" command. The use of "eval" on xz's output might
scare people so this should make the script less scary. This should
address the concerns in [2]. See also my replies [3] and [4].
[2]: https://lore.kernel.org/lkml/27db456edeb6f72e7e229c2333c5d8449718c26e.camel@16bits.net/
[3]: https://lore.kernel.org/lkml/20240403225903.0773746d@kaneli/
[4]: https://lore.kernel.org/lkml/20240404170103.1bc382b3@kaneli/
init/Kconfig | 5 +-
lib/decompress_unxz.c | 14 ++++-
scripts/xz_wrap.sh | 142 ++++++++++++++++++++++++++++++++++++++++--
3 files changed, 152 insertions(+), 9 deletions(-)
diff --git a/init/Kconfig b/init/Kconfig
index 964355d1757e..236105e4d441 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -310,8 +310,9 @@ config KERNEL_XZ
BCJ filters which can improve compression ratio of executable
code. The size of the kernel is about 30% smaller with XZ in
comparison to gzip. On architectures for which there is a BCJ
- filter (i386, x86_64, ARM, IA-64, PowerPC, and SPARC), XZ
- will create a few percent smaller kernel than plain LZMA.
+ filter (i386, x86_64, ARM, ARM64, RISC-V, big endian PowerPC,
+ and SPARC), XZ will create a few percent smaller kernel than
+ plain LZMA.
The speed is about the same as with LZMA: The decompression
speed of XZ is better than that of bzip2 but worse than gzip
diff --git a/lib/decompress_unxz.c b/lib/decompress_unxz.c
index 46aa3be13fc5..cae00395d7a6 100644
--- a/lib/decompress_unxz.c
+++ b/lib/decompress_unxz.c
@@ -126,11 +126,21 @@
#ifdef CONFIG_X86
# define XZ_DEC_X86
#endif
-#ifdef CONFIG_PPC
+#if defined(CONFIG_PPC) && defined(CONFIG_CPU_BIG_ENDIAN)
# define XZ_DEC_POWERPC
#endif
#ifdef CONFIG_ARM
-# define XZ_DEC_ARM
+# ifdef CONFIG_THUMB2_KERNEL
+# define XZ_DEC_ARMTHUMB
+# else
+# define XZ_DEC_ARM
+# endif
+#endif
+#ifdef CONFIG_ARM64
+# define XZ_DEC_ARM64
+#endif
+#ifdef CONFIG_RISCV
+# define XZ_DEC_RISCV
#endif
#ifdef CONFIG_SPARC
# define XZ_DEC_SPARC
diff --git a/scripts/xz_wrap.sh b/scripts/xz_wrap.sh
index c8c36441ab70..f19369687030 100755
--- a/scripts/xz_wrap.sh
+++ b/scripts/xz_wrap.sh
@@ -6,14 +6,146 @@
#
# Author: Lasse Collin <lasse.collin@...aani.org>
+# This has specialized settings for the following archs. However,
+# XZ-compressed kernel isn't currently supported on every listed arch.
+#
+# Arch Align Notes
+# arm 2/4 ARM and ARM-Thumb2
+# arm64 4
+# csky 2
+# loongarch 4
+# mips 2/4 MicroMIPS is 2-byte aligned
+# parisc 4
+# powerpc 4 Uses its own wrapper for compressors instead of this.
+# riscv 2/4
+# s390 2
+# sh 2
+# sparc 4
+# x86 1
+
+# A few archs use 2-byte or 4-byte aligned instructions depending on
+# the kernel config. This function is used to check if the relevant
+# config option is set to "y".
+is_enabled()
+{
+ grep -q "^$1=y$" include/config/auto.conf
+}
+
+# XZ_VERSION is needed to disable features that aren't available in
+# old XZ Utils versions.
+XZ_VERSION=$($XZ --robot --version) || exit
+XZ_VERSION=$(printf '%s\n' "$XZ_VERSION" | sed -n 's/^XZ_VERSION=//p')
+
+# Assume that no BCJ filter is available.
BCJ=
-LZMA2OPTS=
+# Set the instruction alignment to 1, 2, or 4 bytes.
+#
+# Set the BCJ filter if one is available.
+# It must match the #ifdef usage in lib/decompress_unxz.c.
case $SRCARCH in
- x86) BCJ=--x86 ;;
- powerpc) BCJ=--powerpc ;;
- arm) BCJ=--arm ;;
- sparc) BCJ=--sparc ;;
+ arm)
+ if is_enabled CONFIG_THUMB2_KERNEL; then
+ ALIGN=2
+ BCJ=--armthumb
+ else
+ ALIGN=4
+ BCJ=--arm
+ fi
+ ;;
+
+ arm64)
+ ALIGN=4
+
+ # ARM64 filter was added in XZ Utils 5.4.0.
+ if [ "$XZ_VERSION" -ge 50040002 ]; then
+ BCJ=--arm64
+ else
+ echo "$0: Upgrading to xz >= 5.4.0" \
+ "would enable the ARM64 filter" \
+ "for better compression" >&2
+ fi
+ ;;
+
+ csky)
+ ALIGN=2
+ ;;
+
+ loongarch)
+ ALIGN=4
+ ;;
+
+ mips)
+ if is_enabled CONFIG_CPU_MICROMIPS; then
+ ALIGN=2
+ else
+ ALIGN=4
+ fi
+ ;;
+
+ parisc)
+ ALIGN=4
+ ;;
+
+ powerpc)
+ ALIGN=4
+
+ # The filter is only for big endian instruction encoding.
+ if is_enabled CONFIG_CPU_BIG_ENDIAN; then
+ BCJ=--powerpc
+ fi
+ ;;
+
+ riscv)
+ if is_enabled CONFIG_RISCV_ISA_C; then
+ ALIGN=2
+ else
+ ALIGN=4
+ fi
+
+ # RISC-V filter was added in XZ Utils 5.6.0.
+ if [ "$XZ_VERSION" -ge 50060002 ]; then
+ BCJ=--riscv
+ else
+ echo "$0: Upgrading to xz >= 5.6.0" \
+ "would enable the RISC-V filter" \
+ "for better compression" >&2
+ fi
+ ;;
+
+ s390)
+ ALIGN=2
+ ;;
+
+ sh)
+ ALIGN=2
+ ;;
+
+ sparc)
+ ALIGN=4
+ BCJ=--sparc
+ ;;
+
+ x86)
+ ALIGN=1
+ BCJ=--x86
+ ;;
+
+ *)
+ echo "$0: Arch-specific tuning is missing for '$SRCARCH'" >&2
+
+ # Guess 2-byte-aligned instructions. Guessing too low
+ # should hurt less than guessing too high.
+ ALIGN=2
+ ;;
+esac
+
+# Select the LZMA2 options matching the instruction alignment.
+case $ALIGN in
+ 1) LZMA2OPTS= ;;
+ 2) LZMA2OPTS=lp=1 ;;
+ 4) LZMA2OPTS=lp=2,lc=2 ;;
+ *) echo "$0: ALIGN wrong or missing" >&2; exit 1 ;;
esac
# Use single-threaded mode because it compresses a little better
--
2.45.2
Powered by blists - more mailing lists