linux-kernel - Re: [PATCH] tracing: Correct the length check which causes memory corruption

Message-ID: <71fa2e69-a60b-0795-5fef-31658f89591a@linux.alibaba.com>
Date:   Mon, 7 Jun 2021 21:46:02 +0800
From:   James Wang <jnwang@...ux.alibaba.com>
To:     Liangyan <liangyan.peng@...ux.alibaba.com>,
        linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...hat.com>
Cc:     Xunlei Pang <xlpang@...ux.alibaba.com>, yinbinbin@...babacloud.com,
        wetp <wetp.zy@...ux.alibaba.com>, stable@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] tracing: Correct the length check which causes memory
 corruption

Hi list,

The original reproduction command may help you verify this quickly:

#!/bin/bash
stress-ng --all 2 --class filesystem \
  -x chattr,chdir,chmod,chown,symlink,sync-file,utime,verity,xattr \
  --log-file ./stress.log

After inspection, I believe the key stressor is:

stress-ng  --dirdeep 10

It creates a lot of files with very long paths, which can trigger the
out-of-bounds (OOB) access.

My test box is an Intel platform with ~100 cores.


James


On 2021/6/7 at 8:57 PM, Liangyan wrote:
> We've suffered severe kernel crashes caused by memory corruption in
> our production environment, for example:
>
> Call Trace:
> [1640542.554277] general protection fault: 0000 [#1] SMP PTI
> [1640542.554856] CPU: 17 PID: 26996 Comm: python Kdump: loaded Tainted:G
> [1640542.556629] RIP: 0010:kmem_cache_alloc+0x90/0x190
> [1640542.559074] RSP: 0018:ffffb16faa597df8 EFLAGS: 00010286
> [1640542.559587] RAX: 0000000000000000 RBX: 0000000000400200 RCX: 0000000006e931bf
> [1640542.560323] RDX: 0000000006e931be RSI: 0000000000400200 RDI: ffff9a45ff004300
> [1640542.560996] RBP: 0000000000400200 R08: 0000000000023420 R09: 0000000000000000
> [1640542.561670] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9a20608d
> [1640542.562366] R13: ffff9a45ff004300 R14: ffff9a45ff004300 R15: 696c662f65636976
> [1640542.563128] FS:  00007f45d7c6f740(0000) GS:ffff9a45ff840000(0000) knlGS:0000000000000000
> [1640542.563937] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [1640542.564557] CR2: 00007f45d71311a0 CR3: 000000189d63e004 CR4: 00000000003606e0
> [1640542.565279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [1640542.566069] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [1640542.566742] Call Trace:
> [1640542.567009]  anon_vma_clone+0x5d/0x170
> [1640542.567417]  __split_vma+0x91/0x1a0
> [1640542.567777]  do_munmap+0x2c6/0x320
> [1640542.568128]  vm_munmap+0x54/0x70
> [1640542.569990]  __x64_sys_munmap+0x22/0x30
> [1640542.572005]  do_syscall_64+0x5b/0x1b0
> [1640542.573724]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [1640542.575642] RIP: 0033:0x7f45d6e61e27
>
> James Wang has reproduced it reliably on the latest 4.19 LTS.
> After some debugging with a debug tool, we confirmed that it is caused
> by an out-of-bounds access to the ftrace buffer, as follows:
> [   86.775200] BUG: Out-of-bounds write at addr 0xffff88aefe8b7000
> [   86.780806]  no_context+0xdf/0x3c0
> [   86.784327]  __do_page_fault+0x252/0x470
> [   86.788367]  do_page_fault+0x32/0x140
> [   86.792145]  page_fault+0x1e/0x30
> [   86.795576]  strncpy_from_unsafe+0x66/0xb0
> [   86.799789]  fetch_memory_string+0x25/0x40
> [   86.804002]  fetch_deref_string+0x51/0x60
> [   86.808134]  kprobe_trace_func+0x32d/0x3a0
> [   86.812347]  kprobe_dispatcher+0x45/0x50
> [   86.816385]  kprobe_ftrace_handler+0x90/0xf0
> [   86.820779]  ftrace_ops_assist_func+0xa1/0x140
> [   86.825340]  0xffffffffc00750bf
> [   86.828603]  do_sys_open+0x5/0x1f0
> [   86.832124]  do_syscall_64+0x5b/0x1b0
> [   86.835900]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Commit b220c049d519 ("tracing: Check length before giving out
> the filter buffer") added a length check to guard against the trace
> data overflow introduced in 0fc1b09ff1ff, but that fix does not prevent
> the overflow entirely. The length check must also account for
> sizeof(entry->array[0]), because array[0] stores the length of the
> trace data and occupies additional space, which risks an overflow.
>
> Cc: stable@...r.kernel.org
> Fixes: b220c049d519 ("tracing: Check length before giving out the filter buffer")
> Signed-off-by: Liangyan <liangyan.peng@...ux.alibaba.com>
> Reviewed-by: Xunlei Pang <xlpang@...ux.alibaba.com>
> Reviewed-by: yinbinbin <yinbinbin@...babacloud.com>
> Reviewed-by: Wetp Zhang <wetp.zy@...ux.alibaba.com>
> Tested-by: James Wang <jnwang@...ux.alibaba.com>
> Cc: Xunlei Pang <xlpang@...ux.alibaba.com>
> Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> ---
>   kernel/trace/trace.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index a21ef9cd2aae..9299057feb56 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -2736,7 +2736,7 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb,
>   	    (entry = this_cpu_read(trace_buffered_event))) {
>   		/* Try to use the per cpu buffer first */
>   		val = this_cpu_inc_return(trace_buffered_event_cnt);
> -		if ((len < (PAGE_SIZE - sizeof(*entry))) && val == 1) {
> +		if ((len < (PAGE_SIZE - sizeof(*entry) - sizeof(entry->array[0]))) && val == 1) {
>   			trace_event_setup(entry, type, trace_ctx);
>   			entry->array[0] = len;
>   			return entry;
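
For anyone following the arithmetic behind the one-line change above, here
is a minimal userspace sketch of the two checks. It is not kernel code:
struct model_event is a simplified stand-in for the ring buffer event
(a 4-byte header followed by 32-bit array slots), MODEL_PAGE_SIZE assumes
a 4 KiB page, and all helper names are hypothetical. It also assumes, as
the commit message describes, that the payload is stored after array[0].

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB page backs the buffered event */

/* Simplified stand-in for the ring buffer event: header + 32-bit slots. */
struct model_event {
	uint32_t header;   /* models the packed type_len/time_delta word */
	uint32_t array[];  /* array[0] holds the payload length */
};

/* Bytes consumed in the page: header + array[0] + len bytes of payload. */
static size_t bytes_used(size_t len)
{
	return sizeof(struct model_event) + sizeof(uint32_t) + len;
}

/* Check before the patch: ignores the space taken by array[0]. */
static int old_check(size_t len)
{
	return len < MODEL_PAGE_SIZE - sizeof(struct model_event);
}

/* Check after the patch: also subtracts sizeof(entry->array[0]). */
static int new_check(size_t len)
{
	return len < MODEL_PAGE_SIZE - sizeof(struct model_event)
				     - sizeof(uint32_t);
}

/* Largest payload length that a given check still accepts. */
static size_t max_accepted(int (*check)(size_t))
{
	size_t len = 0;

	while (check(len + 1))
		len++;
	return len;
}

/* Print the worst case for one check against the page size. */
static void report(const char *name, int (*check)(size_t))
{
	size_t len = max_accepted(check);

	printf("%s: max len %zu uses %zu of %u bytes\n",
	       name, len, bytes_used(len), MODEL_PAGE_SIZE);
}
```

Calling report("old", old_check) and report("new", new_check) from a small
main() shows the difference. Under this model the old check accepts a
payload whose total footprint exceeds the page by a few bytes, while the
patched check keeps the footprint within the page.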