lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <404dfe9a-1f4f-4776-863a-d8bbe08335e2@samsung.com>
Date: Wed, 25 Jun 2025 13:41:50 +0200
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Arnd Bergmann <arnd@...nel.org>, Alexander Viro
	<viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>
Cc: Arnd Bergmann <arnd@...db.de>, Jan Kara <jack@...e.cz>, Alexander
	Mikhalitsyn <alexander@...alicyn.com>, Jann Horn <jannh@...gle.com>, Luca
	Boccassi <luca.boccassi@...il.com>, Jeff Layton <jlayton@...nel.org>, Roman
	Kisel <romank@...ux.microsoft.com>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] coredump: reduce stack usage in vfs_coredump()

Hi,

On 20.06.2025 13:21, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@...db.de>
>
> The newly added socket coredump code runs into some corner cases
> with KASAN that end up needing a lot of stack space:
>
> fs/coredump.c:1206:1: error: the frame size of 1680 bytes is larger than 1280 bytes [-Werror=frame-larger-than=]
>
> Mark the socket helper function as noinline_for_stack so its stack
> usage does not leak out to the other code paths. This also seems to
> help with register pressure, and the resulting combined stack usage of
> vfs_coredump() and coredump_socket() is actually lower than the inlined
> version.
>
> Moving the core_state variable into coredump_wait() helps reduce the
> stack usage further and simplifies the code, though it is not sufficient
> to avoid the warning by itself.
>
> Fixes: 6a7a50e5f1ac ("coredump: use a single helper for the socket")
> Signed-off-by: Arnd Bergmann <arnd@...db.de>

This change appears in today's linux-next (next-20250625) as commit 
fb82645d3f72 ("coredump: reduce stack usage in vfs_coredump()"). In my 
tests I found that it causes a kernel oops on some of my ARM 32bit 
Exynos based boards. This is really strange, because I don't see any 
obvious problem in this patch. Reverting $subject on top of linux-next 
hides/fixes the oops. I suspect some kind of use-after-free issue, but I 
cannot point anything related. Here is the kernel log from one of the 
affected boards (I've intentionally kept the register and stack dumps):

8<--- cut here ---
Unable to handle kernel paging request at virtual address c10ba370 when 
write
[c10ba370] *pgd=4101941e(bad)
Internal error: Oops: 80d [#1] SMP ARM
Modules linked in: cmac bnep mwifiex_sdio mwifiex btmrvl_sdio btmrvl 
sha256 libsha256_generic bluetooth cfg80211 exynos_gsc v4l2_mem2mem 
s5p_mfc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 
videobuf2_common ecdh_generic ecc videodev mc s5p_cec
CPU: 1 UID: 0 PID: 1367 Comm: cgm-release-age Not tainted 
6.16.0-rc3-next-20250625 #10627 PREEMPT
Hardware name: Samsung Exynos (Flattened Device Tree)
PC is at vfs_coredump+0x294/0x17bc
LR is at _raw_spin_unlock_irq+0x20/0x50
pc : [<c03c4534>]    lr : [<c0cf096c>]    psr: a0000013
sp : f1249dd0  ip : 00000000  fp : f1249e68
r10: c1ae41e4  r9 : 00000000  r8 : c13aaa74
r7 : 00000000  r6 : 63726f46  r5 : c2a3c000  r4 : c3852080
r3 : c10ba370  r2 : c3852080  r1 : ffffffff  r0 : 00006325
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 4324c06a  DAC: 00000051
Register r0 information: non-paged memory
Register r1 information: non-paged memory
Register r2 information: slab task_struct start c3852080 pointer offset 
0 size 4160
Register r3 information: non-slab/vmalloc memory
Register r4 information: slab task_struct start c3852080 pointer offset 
0 size 4160
Register r5 information: slab kmalloc-128 start c2a3c000 pointer offset 
0 size 128
Register r6 information: non-paged memory
Register r7 information: NULL pointer
Register r8 information: non-slab/vmalloc memory
Register r9 information: NULL pointer
Register r10 information: non-slab/vmalloc memory
Register r11 information: 2-page vmalloc region starting at 0xf1248000 
allocated at kernel_clone+0x58/0x3f4
Register r12 information: NULL pointer
Process cgm-release-age (pid: 1367, stack limit = 0x4909c75e)
Stack: (0xf1249dd0 to 0xf124a000)
9dc0:                                     c3852830 00000000 c3852080 
c01ae7fc
9de0: c13095a8 c45c8600 c2a3c000 00000000 00000000 c014ab28 c102c84c 
c104be90
9e00: c13095a8 c1496b58 ffffffff 00000081 00000000 6fda5e52 c127e2ec 
00000000
9e20: c2a3c280 00000004 00000080 00000000 c3852800 00000001 00000001 
00000000
9e40: c2855d90 00000000 c3852830 c0ce2874 c13133c8 c3852080 c13133e0 
c1472911
9e60: c1cc6480 ef0484e0 f1249f60 00000000 00000000 800000cd 00000001 
00000000
9e80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
00000000
9ea0: 00000000 00000000 00000006 c01450a8 f1249f4c 00000001 c12849dc 
6fda5e52
9ec0: c0cf096c c3852080 00000005 00000006 400004d8 0001e000 c1309154 
c2855d80
9ee0: f1249f60 c014ab28 00000000 c02eeb48 00000000 c3852830 00000000 
ce4d1c00
9f00: f1249f4c c13095a8 c127e2ec c2855d90 00000000 6fda5e52 c12849dc 
c3852080
9f20: f1249fb0 00000000 f1249f4c 5ac3c35a b6d21668 c3852730 b6d2166c 
c011a430
9f40: 00000000 00000002 f1249f74 c0cf096c bee9b9d4 c0148328 f1249f7c 
00000000
9f60: 00000006 00000000 fffffffa 00000557 00000000 00000000 00000000 
00000000
9f80: 00000000 6fda5e52 00000000 00000000 b6f43968 bee9b9d4 000000af 
c0100290
9fa0: c3852080 000000af 00000000 c0100088 00000000 bee9b9d4 00000000 
00000008
9fc0: 00000000 b6f43968 bee9b9d4 000000af bee9bf70 b6ebdc94 00440f90 
00000000
9fe0: 00000020 bee9b9d0 ffffffff b6d2166c 00000010 00000002 00000000 
00000000
Call trace:
  vfs_coredump from get_signal+0x990/0xd9c
  get_signal from do_work_pending+0x118/0x588
  do_work_pending from slow_work_pending+0xc/0x24
Exception stack(0xf1249fb0 to 0xf1249ff8)
9fa0:                                     00000000 bee9b9d4 00000000 
00000008
9fc0: 00000000 b6f43968 bee9b9d4 000000af bee9bf70 b6ebdc94 00440f90 
00000000
9fe0: 00000020 bee9b9d0 ffffffff b6d2166c 00000010 00000002
Code: e1a03006 e5966004 e5930000 f57ff05b (e5837000)
---[ end trace 0000000000000000 ]---
note: cgm-release-age[1367] exited with irqs disabled


> ---
>   fs/coredump.c | 20 ++++++++++----------
>   1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index e2611fb1f254..c46e3996ff91 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -518,27 +518,28 @@ static int zap_threads(struct task_struct *tsk,
>   	return nr;
>   }
>   
> -static int coredump_wait(int exit_code, struct core_state *core_state)
> +static int coredump_wait(int exit_code)
>   {
>   	struct task_struct *tsk = current;
> +	struct core_state core_state;
>   	int core_waiters = -EBUSY;
>   
> -	init_completion(&core_state->startup);
> -	core_state->dumper.task = tsk;
> -	core_state->dumper.next = NULL;
> +	init_completion(&core_state.startup);
> +	core_state.dumper.task = tsk;
> +	core_state.dumper.next = NULL;
>   
> -	core_waiters = zap_threads(tsk, core_state, exit_code);
> +	core_waiters = zap_threads(tsk, &core_state, exit_code);
>   	if (core_waiters > 0) {
>   		struct core_thread *ptr;
>   
> -		wait_for_completion_state(&core_state->startup,
> +		wait_for_completion_state(&core_state.startup,
>   					  TASK_UNINTERRUPTIBLE|TASK_FREEZABLE);
>   		/*
>   		 * Wait for all the threads to become inactive, so that
>   		 * all the thread context (extended register state, like
>   		 * fpu etc) gets copied to the memory.
>   		 */
> -		ptr = core_state->dumper.next;
> +		ptr = core_state.dumper.next;
>   		while (ptr != NULL) {
>   			wait_task_inactive(ptr->task, TASK_ANY);
>   			ptr = ptr->next;
> @@ -858,7 +859,7 @@ static bool coredump_sock_request(struct core_name *cn, struct coredump_params *
>   	return coredump_sock_mark(cprm->file, COREDUMP_MARK_REQACK);
>   }
>   
> -static bool coredump_socket(struct core_name *cn, struct coredump_params *cprm)
> +static noinline_for_stack bool coredump_socket(struct core_name *cn, struct coredump_params *cprm)
>   {
>   	if (!coredump_sock_connect(cn, cprm))
>   		return false;
> @@ -1095,7 +1096,6 @@ void vfs_coredump(const kernel_siginfo_t *siginfo)
>   {
>   	struct cred *cred __free(put_cred) = NULL;
>   	size_t *argv __free(kfree) = NULL;
> -	struct core_state core_state;
>   	struct core_name cn;
>   	struct mm_struct *mm = current->mm;
>   	struct linux_binfmt *binfmt = mm->binfmt;
> @@ -1131,7 +1131,7 @@ void vfs_coredump(const kernel_siginfo_t *siginfo)
>   	if (coredump_force_suid_safe(&cprm))
>   		cred->fsuid = GLOBAL_ROOT_UID;
>   
> -	if (coredump_wait(siginfo->si_signo, &core_state) < 0)
> +	if (coredump_wait(siginfo->si_signo) < 0)
>   		return;
>   
>   	old_cred = override_creds(cred);

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ