lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20221124074252.1fced8fc9dd714acd0d6c1a5@kernel.org>
Date:   Thu, 24 Nov 2022 07:42:52 +0900
From:   Masami Hiramatsu (Google) <mhiramat@...nel.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Zheng Yejian <zhengyejian1@...wei.com>, stable@...r.kernel.org,
        Yujie Liu <yujie.liu@...el.com>
Subject: Re: [PATCH 2/2] tracing: Free buffers when a used dynamic event is
 removed

On Wed, 23 Nov 2022 14:25:58 -0500
Steven Rostedt <rostedt@...dmis.org> wrote:

> From: "Steven Rostedt (Google)" <rostedt@...dmis.org>
> 
> After 65536 dynamic events have been added and removed, the "type" field
> of the event then uses the first type number that is available (not
> currently used by other events). A type number is the identifier of the
> binary blobs in the tracing ring buffer (known as events) to map them to
> logic that can parse the binary blob.
> 
> The issue is that if a dynamic event (like a kprobe event) is traced and
> is in the ring buffer, and then that event is removed (because it is
> dynamic, which means it can be created and destroyed), if another dynamic
> event is created that has the same number that new event's logic on
> parsing the binary blob will be used.
> 
> To show how this can be an issue, the following can crash the kernel:
> 
>  # cd /sys/kernel/tracing
>  # for i in `seq 65536`; do
>      echo 'p:kprobes/foo do_sys_openat2 $arg1:u32' > kprobe_events
>  # done
> 
> For every iteration of the above, the writing to the kprobe_events will
> remove the old event and create a new one (with the same format) and
> increase the type number to the next available on until the type number
> reaches over 65535 which is the max number for the 16 bit type. After it
> reaches that number, the logic to allocate a new number simply looks for
> the next available number. When an dynamic event is removed, that number
> is then available to be reused by the next dynamic event created. That is,
> once the above reaches the max number, the number assigned to the event in
> that loop will remain the same.
> 
> Now that means deleting one dynamic event and created another will reuse
> the previous events type number. This is where bad things can happen.
> After the above loop finishes, the kprobes/foo event which reads the
> do_sys_openat2 function call's first parameter as an integer.
> 
>  # echo 1 > kprobes/foo/enable
>  # cat /etc/passwd > /dev/null
>  # cat trace
>              cat-2211    [005] ....  2007.849603: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196
>              cat-2211    [005] ....  2007.849620: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196
>              cat-2211    [005] ....  2007.849838: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196
>              cat-2211    [005] ....  2007.849880: foo: (do_sys_openat2+0x0/0x130) arg1=4294967196
>  # echo 0 > kprobes/foo/enable
> 
> Now if we delete the kprobe and create a new one that reads a string:
> 
>  # echo 'p:kprobes/foo do_sys_openat2 +0($arg2):string' > kprobe_events
> 
> And now we can the trace:
> 
>  # cat trace
>         sendmail-1942    [002] .....   530.136320: foo: (do_sys_openat2+0x0/0x240) arg1=             cat-2046    [004] .....   530.930817: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������"
>              cat-2046    [004] .....   530.930961: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������"
>              cat-2046    [004] .....   530.934278: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������"
>              cat-2046    [004] .....   530.934563: foo: (do_sys_openat2+0x0/0x240) arg1="������������������������������������������������������������������������������������������������"
>             bash-1515    [007] .....   534.299093: foo: (do_sys_openat2+0x0/0x240) arg1="kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk���������@��4Z����;Y�����U

Aah, right. Even if we remove one dynamic event, it was shown
as unknown events.

> 
> And dmesg has:
> 
> ==================================================================
> BUG: KASAN: use-after-free in string+0xd4/0x1c0
> Read of size 1 at addr ffff88805fdbbfa0 by task cat/2049
> 
>  CPU: 0 PID: 2049 Comm: cat Not tainted 6.1.0-rc6-test+ #641
>  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
>  Call Trace:
>   <TASK>
>   dump_stack_lvl+0x5b/0x77
>   print_report+0x17f/0x47b
>   kasan_report+0xad/0x130
>   string+0xd4/0x1c0
>   vsnprintf+0x500/0x840
>   seq_buf_vprintf+0x62/0xc0
>   trace_seq_printf+0x10e/0x1e0
>   print_type_string+0x90/0xa0
>   print_kprobe_event+0x16b/0x290
>   print_trace_line+0x451/0x8e0
>   s_show+0x72/0x1f0
>   seq_read_iter+0x58e/0x750
>   seq_read+0x115/0x160
>   vfs_read+0x11d/0x460
>   ksys_read+0xa9/0x130
>   do_syscall_64+0x3a/0x90
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
>  RIP: 0033:0x7fc2e972ade2
>  Code: c0 e9 b2 fe ff ff 50 48 8d 3d b2 3f 0a 00 e8 05 f0 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
>  RSP: 002b:00007ffc64e687c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
>  RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007fc2e972ade2
>  RDX: 0000000000020000 RSI: 00007fc2e980d000 RDI: 0000000000000003
>  RBP: 00007fc2e980d000 R08: 00007fc2e980c010 R09: 0000000000000000
>  R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020f00
>  R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
>   </TASK>
> 
>  The buggy address belongs to the physical page:
>  page:ffffea00017f6ec0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5fdbb
>  flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
>  raw: 000fffffc0000000 0000000000000000 ffffea00017f6ec8 0000000000000000
>  raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
>  page dumped because: kasan: bad access detected
> 
>  Memory state around the buggy address:
>   ffff88805fdbbe80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>   ffff88805fdbbf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>  >ffff88805fdbbf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>                                 ^
>   ffff88805fdbc000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>   ffff88805fdbc080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>  ==================================================================
> 
> This was found when Zheng Yejian sent a patch to convert the even type
> number assignment to use IDA, which gives the next available number, and
> this bug showed up in the fuzz testing by Yujie Liu and the kernel test
> robot. But after further analysis, I found that this behavior is the same
> as when the event type numbers go past the 16bit max (and the above shows
> that).
> 
> As modules have a similar issue, but is dealt with by setting a
> "WAS_ENABLED" flag when a module event is enabled, and when the module is
> freed, if any of its events were enabled, the ring buffer that holds that
> event is also cleared, to prevent reading stale events. The same can be
> done for dynamic events.

Indeed. If the dynamic event had not been enabled, there is no reason
to clear the buffer.

This looks good to me.

Acked-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>

Thank you!

> 
> If any dynamic event that is being removed was enabled, then make sure the
> buffers they were enabled in are now cleared.
> 
> Link: https://lore.kernel.org/all/20221110020319.1259291-1-zhengyejian1@huawei.com/
> 
> Cc: stable@...r.kernel.org
> Depends-on: TBD ("tracing: Add tracing_reset_all_online_cpus_unlocked() function")
> Depends-on: 5448d44c38557 ("tracing: Add unified dynamic event framework")
> Depends-on: 6212dd29683ee ("tracing/kprobes: Use dyn_event framework for kprobe events")
> Depends-on: 065e63f951432 ("tracing: Only have rmmod clear buffers that its events were active in")
> Depends-on: 575380da8b469 ("tracing: Only clear trace buffer on module unload if event was traced")
> Fixes: 77b44d1b7c283 ("tracing/kprobes: Rename Kprobe-tracer to kprobe-event")
> Reported-by: Zheng Yejian <zhengyejian1@...wei.com>
> Reported-by: Yujie Liu <yujie.liu@...el.com>
> Reported-by: kernel test robot <yujie.liu@...el.com>
> Signed-off-by: Steven Rostedt (Google) <rostedt@...dmis.org>
> ---
>  kernel/trace/trace_dynevent.c |  2 ++
>  kernel/trace/trace_events.c   | 11 ++++++++++-
>  2 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
> index 154996684fb5..4376887e0d8a 100644
> --- a/kernel/trace/trace_dynevent.c
> +++ b/kernel/trace/trace_dynevent.c
> @@ -118,6 +118,7 @@ int dyn_event_release(const char *raw_command, struct dyn_event_operations *type
>  		if (ret)
>  			break;
>  	}
> +	tracing_reset_all_online_cpus();
>  	mutex_unlock(&event_mutex);
>  out:
>  	argv_free(argv);
> @@ -214,6 +215,7 @@ int dyn_events_release_all(struct dyn_event_operations *type)
>  			break;
>  	}
>  out:
> +	tracing_reset_all_online_cpus();
>  	mutex_unlock(&event_mutex);
>  
>  	return ret;
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index 0449e3c7d327..3bfaf560ecc4 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -2947,7 +2947,10 @@ static int probe_remove_event_call(struct trace_event_call *call)
>  		 * TRACE_REG_UNREGISTER.
>  		 */
>  		if (file->flags & EVENT_FILE_FL_ENABLED)
> -			return -EBUSY;
> +			goto busy;
> +
> +		if (file->flags & EVENT_FILE_FL_WAS_ENABLED)
> +			tr->clear_trace = true;
>  		/*
>  		 * The do_for_each_event_file_safe() is
>  		 * a double loop. After finding the call for this
> @@ -2960,6 +2963,12 @@ static int probe_remove_event_call(struct trace_event_call *call)
>  	__trace_remove_event_call(call);
>  
>  	return 0;
> + busy:
> +	/* No need to clear the trace now */
> +	list_for_each_entry(tr, &ftrace_trace_arrays, list) {
> +		tr->clear_trace = false;
> +	}
> +	return -EBUSY;
>  }
>  
>  /* Remove an event_call */
> -- 
> 2.35.1
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ