Message-Id: <58cc1053-7208-4b22-99cb-210fdf700569@app.fastmail.com>
Date: Fri, 08 Mar 2024 14:15:08 +0100
From: "Arnd Bergmann" <arnd@...db.de>
To: "Yuntao Liu" <liuyuntao12@...wei.com>,
 linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
 "Ard Biesheuvel" <ardb@...nel.org>, "Fangrui Song" <maskray@...gle.com>
Cc: "Russell King" <linux@...linux.org.uk>, "Andrew Davis" <afd@...com>,
 "Andrew Morton" <akpm@...ux-foundation.org>,
 "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
 "Geert Uytterhoeven" <geert+renesas@...der.be>,
 "Jonathan Corbet" <corbet@....net>, "Mike Rapoport" <rppt@...nel.org>,
 "Rob Herring" <robh@...nel.org>, "Thomas Gleixner" <tglx@...utronix.de>,
 "Linus Walleij" <linus.walleij@...aro.org>, llvm@...ts.linux.dev
Subject: Re: [PATCH-next v2] arm32: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION

On Thu, Mar 7, 2024, at 16:12, Yuntao Liu wrote:
> The current arm32 architecture does not yet support the
> HAVE_LD_DEAD_CODE_DATA_ELIMINATION feature. arm32 is widely used in
> embedded scenarios, and enabling this feature would be beneficial for
> reducing the size of the kernel image.
>
> In order to make this work, we keep the necessary tables by annotating
> them with KEEP, also it requires further changes to linker script to KEEP
> some tables and wildcard compiler generated sections into the right place.
>
> It boots normally with defconfig, vexpress_defconfig and tinyconfig.
>
> The size comparison of zImage is as follows:
> defconfig       vexpress_defconfig      tinyconfig
> 5137712         5138024                 424192          no dce
> 5032560         4997824                 298384          dce
> 2.0%            2.7%                    29.7%           shrink
>
> When using smaller config file, there is a significant reduction in the
> size of the zImage.
>
> We also tested this patch on a commercially available single-board
> computer, and the comparison is as follows:
> a15eb_config
> 2161384         no dce
> 2092240         dce
> 3.2%            shrink
>
> The zImage size has been reduced by approximately 3.2%, which is 70KB on
> 2.1M.
>
> Signed-off-by: Yuntao Liu <liuyuntao12@...wei.com>
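
The shrink percentages quoted above can be rechecked from the raw zImage
sizes. A minimal sketch in Python; the helper name shrink_percent and the
dictionary layout are only for illustration and are not part of the patch:

```python
# Recompute the "shrink" row from the zImage sizes quoted in the commit message.
# shrink_percent is a hypothetical helper name, not something from the kernel tree.

def shrink_percent(no_dce: int, dce: int) -> float:
    """Return the size reduction of dce relative to no_dce, in percent."""
    return (no_dce - dce) / no_dce * 100.0

# (size without DCE, size with DCE) in bytes, as reported in the message
sizes = {
    "defconfig": (5137712, 5032560),
    "vexpress_defconfig": (5138024, 4997824),
    "tinyconfig": (424192, 298384),
    "a15eb_config": (2161384, 2092240),
}

for name, (no_dce, dce) in sizes.items():
    saved = no_dce - dce
    print(f"{name:20s} {saved:7d} bytes saved, {shrink_percent(no_dce, dce):.1f}% shrink")
```

Rounded to one decimal place, this reproduces the 2.0%, 2.7%, 29.7% and
3.2% figures given in the tables.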

I've retested with both gcc-13 and clang-18, and see no
more build issues. Your previous version already worked
fine for me.

I did some tests combining this with CONFIG_TRIM_UNUSED_KSYMS,
which showed a significant improvement as expected. I also
tried combining it with an experimental CONFIG_LTO_CLANG
patch, but that did not show any further improvements.

Tested-by: Arnd Bergmann <arnd@...db.de>
Reviewed-by: Arnd Bergmann <arnd@...db.de>

Adding Ard Biesheuvel and Fangrui Song to Cc, so they can comment
on the ARM_VECTORS_TEXT workaround. I don't understand enough of
the details of what is going on here.

Full quote of the patch below so they can see the whole thing.

If they are also happy with the patch, I think you can send it
into Russell's patch tracker at
https://www.armlinux.org.uk/developer/patches/info.php

> ---
> v2:
>    - Support config XIP_KERNEL.
>    - Support LLVM compilation.
>
> v1: https://lore.kernel.org/all/20240220081527.23408-1-liuyuntao12@huawei.com/
> ---
>  arch/arm/Kconfig                       |  1 +
>  arch/arm/boot/compressed/vmlinux.lds.S |  4 ++--
>  arch/arm/include/asm/vmlinux.lds.h     | 18 +++++++++++++++++-
>  arch/arm/kernel/vmlinux-xip.lds.S      |  8 ++++++--
>  arch/arm/kernel/vmlinux.lds.S          | 10 +++++++---
>  5 files changed, 33 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 0af6709570d1..de78ceb821df 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -113,6 +113,7 @@ config ARM
>  	select HAVE_KERNEL_XZ
>  	select HAVE_KPROBES if !XIP_KERNEL && !CPU_ENDIAN_BE32 && !CPU_V7M
>  	select HAVE_KRETPROBES if HAVE_KPROBES
> +	select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
>  	select HAVE_MOD_ARCH_SPECIFIC
>  	select HAVE_NMI
>  	select HAVE_OPTPROBES if !THUMB2_KERNEL
> diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S
> index 3fcb3e62dc56..da21244aa892 100644
> --- a/arch/arm/boot/compressed/vmlinux.lds.S
> +++ b/arch/arm/boot/compressed/vmlinux.lds.S
> @@ -89,7 +89,7 @@ SECTIONS
>       * The EFI stub always executes from RAM, and runs strictly before the
>       * decompressor, so we can make an exception for its r/w data, and keep it
>       */
> -    *(.data.efistub .bss.efistub)
> +    *(.data.* .bss.*)
>      __pecoff_data_end = .;
> 
>      /*
> @@ -125,7 +125,7 @@ SECTIONS
> 
>    . = BSS_START;
>    __bss_start = .;
> -  .bss			: { *(.bss) }
> +  .bss			: { *(.bss .bss.*) }
>    _end = .;
> 
>    . = ALIGN(8);		/* the stack must be 64-bit aligned */
> diff --git a/arch/arm/include/asm/vmlinux.lds.h b/arch/arm/include/asm/vmlinux.lds.h
> index 4c8632d5c432..dfe2b6ad6b51 100644
> --- a/arch/arm/include/asm/vmlinux.lds.h
> +++ b/arch/arm/include/asm/vmlinux.lds.h
> @@ -42,7 +42,7 @@
>  #define PROC_INFO							\
>  		. = ALIGN(4);						\
>  		__proc_info_begin = .;					\
> -		*(.proc.info.init)					\
> +		KEEP(*(.proc.info.init))				\
>  		__proc_info_end = .;
> 
>  #define IDMAP_TEXT							\
> @@ -87,6 +87,22 @@
>  		*(.vfp11_veneer)                                        \
>  		*(.v4_bx)
> 
> +/*
> +When CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is enabled, it is important to
> +annotate .vectors sections with KEEP. While linking with ld, it is
> +acceptable to directly use KEEP with .vectors sections in ARM_VECTORS.
> +However, when using ld.lld for linking, KEEP is not recognized within the
> +OVERLAY command; it is treated as a regular string. Hence, it is advisable
> +to define a distinct section here that explicitly retains the .vectors
> +sections when CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is turned on.
> +*/
> +#define ARM_VECTORS_TEXT						\
> +	.vectors.text : {						\
> +		KEEP(*(.vectors))					\
> +		KEEP(*(.vectors.bhb.loop8))				\
> +		KEEP(*(.vectors.bhb.bpiall))				\
> +       }
> +
>  #define ARM_TEXT							\
>  		IDMAP_TEXT						\
>  		__entry_text_start = .;					\
> diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
> index c16d196b5aad..035fa18060b3 100644
> --- a/arch/arm/kernel/vmlinux-xip.lds.S
> +++ b/arch/arm/kernel/vmlinux-xip.lds.S
> @@ -63,7 +63,7 @@ SECTIONS
>  	. = ALIGN(4);
>  	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {
>  		__start___ex_table = .;
> -		ARM_MMU_KEEP(*(__ex_table))
> +		ARM_MMU_KEEP(KEEP(*(__ex_table)))
>  		__stop___ex_table = .;
>  	}
> 
> @@ -83,7 +83,7 @@ SECTIONS
>  	}
>  	.init.arch.info : {
>  		__arch_info_begin = .;
> -		*(.arch.info.init)
> +		KEEP(*(.arch.info.init))
>  		__arch_info_end = .;
>  	}
>  	.init.tagtable : {
> @@ -135,6 +135,10 @@ SECTIONS
>  	ARM_TCM
>  #endif
> 
> +#ifdef LD_DEAD_CODE_DATA_ELIMINATION
> +	ARM_VECTORS_TEXT
> +#endif
> +
>  	/*
>  	 * End of copied data. We need a dummy section to get its LMA.
>  	 * Also located before final ALIGN() as trailing padding is not stored
> diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
> index bd9127c4b451..2cfb890c93fb 100644
> --- a/arch/arm/kernel/vmlinux.lds.S
> +++ b/arch/arm/kernel/vmlinux.lds.S
> @@ -74,7 +74,7 @@ SECTIONS
>  	. = ALIGN(4);
>  	__ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) {
>  		__start___ex_table = .;
> -		ARM_MMU_KEEP(*(__ex_table))
> +		ARM_MMU_KEEP(KEEP(*(__ex_table)))
>  		__stop___ex_table = .;
>  	}
> 
> @@ -99,7 +99,7 @@ SECTIONS
>  	}
>  	.init.arch.info : {
>  		__arch_info_begin = .;
> -		*(.arch.info.init)
> +		KEEP(*(.arch.info.init))
>  		__arch_info_end = .;
>  	}
>  	.init.tagtable : {
> @@ -116,7 +116,7 @@ SECTIONS
>  #endif
>  	.init.pv_table : {
>  		__pv_table_begin = .;
> -		*(.pv_table)
> +		KEEP(*(.pv_table))
>  		__pv_table_end = .;
>  	}

I previously asked about this bit, since it appeared that this
might prevent a lot of code from being discarded when
CONFIG_ARM_PATCH_PHYS_VIRT is set. I tested this again now,
and found this makes very little practical difference, so
it's all good.

> @@ -134,6 +134,10 @@ SECTIONS
>  	ARM_TCM
>  #endif
> 
> +#ifdef LD_DEAD_CODE_DATA_ELIMINATION
> +	ARM_VECTORS_TEXT
> +#endif
> +
>  #ifdef CONFIG_STRICT_KERNEL_RWX
>  	. = ALIGN(1<<SECTION_SHIFT);
>  #else
