lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 9 Apr 2024 22:49:22 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Zheng Yejian <zhengyejian1@...wei.com>
Cc: <naveen.n.rao@...ux.ibm.com>, <anil.s.keshavamurthy@...el.com>,
 <davem@...emloft.net>, <linux-kernel@...r.kernel.org>,
 <linux-trace-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] kprobes: Avoid possible warn in
 __arm_kprobe_ftrace()

On Tue, 9 Apr 2024 14:20:45 +0800
Zheng Yejian <zhengyejian1@...wei.com> wrote:

> On 2024/4/8 20:41, Masami Hiramatsu (Google) wrote:
> > Hi Zheng,
> > 
> > On Mon, 8 Apr 2024 16:34:03 +0800
> > Zheng Yejian <zhengyejian1@...wei.com> wrote:
> > 
> >> There is once warn in __arm_kprobe_ftrace() on:
> >>
> >>   ret = ftrace_set_filter_ip(ops, (unsigned long)p->addr, 0, 0);
> >>   if (WARN_ONCE(..., "Failed to arm kprobe-ftrace at %pS (error %d)\n", ...)
> >>     return ret;
> >>
> >> This warning is generated because 'p->addr' is detected to be not a valid
> >> ftrace location in ftrace_set_filter_ip(). The ftrace address check is done
> >> by check_ftrace_location() at the beginning of check_kprobe_address_safe().
> >> At that point, ftrace_location(addr) == addr should return true if the
> >> module is loaded. Then the module is searched twice:
> >>    1. in is_module_text_address(), we find that 'p->addr' is in a module;
> >>    2. in __module_text_address(), we find the module;
> >>
> >> If the module has just been unloaded before the second search, then
> >> '*probed_mod' is NULL and we would not go to get the module refcount,
> >> then the return value of check_kprobe_address_safe() would be 0, but
> >> actually we need to return -EINVAL.
> > 
> > OK, so you found a race window in check_kprobe_address_safe().
> > 
> > It does something like below.
> > 
> > check_kprobe_address_safe() {
> > 	...
> > 
> > 	/* Timing [A] */
> > 
> > 	if (!(core_kernel_text(p->addr) ||
> > 		is_module_text_address(p->addr)) ||
> > 		...(other reserved address check)) {
> > 		return -EINVAL;
> > 	}
> > 
> > 	/* Timing [B] */
> > 
> > 	*probed_mod = __module_text_address(p->addr):
> > 	if (*probe_mod) {
> > 		if (!try_module_get(*probed_mod)) {
> > 			return -ENOENT;
> > 		}
> > 		...	
> > 	}
> > }
> > 
> > So, if p->addr is in a module which is alive at the timing [A], but
> > unloaded at timing [B], 'p->addr' is passed the
> > 'is_module_text_address(p->addr)' check, but *probed_mod becomes NULL.
> > Thus the corresponding module is not referenced and kprobe_arm(p) will
> > access a wrong address (use after free).
> > This happens either kprobe on ftrace is enabled or not.
> 
> Yes, This is the problem. And for this case, check_kprobe_address_safe() 
> still return 0, and then going on to arm kprobe may cause problems. So
> we should make check_kprobe_address_safe() return -EINVAL when refcount
> of the module is not got.

Yes,

> 
> > 
> > To fix this problem, we should move the mutex_lock(kprobe_mutex) before
> > check_kprobe_address_safe() because kprobe_module_callback() also lock it
> > so it can stop module unloading.
> > 
> > Can you ensure this will fix your problem?
> 
> It seems not, the warning in __arm_kprobe_ftrace() still occurs. I
> contrived following simple test:
> 
>      #!/bin/bash
>      sysctl -w kernel.panic_on_warn=1
>      while [ True ]; do
>          insmod mod.ko    # contain function 'foo'
>          rmmod mod.ko
>      done &
>      while [ True ]; do
>          insmod kprobe.ko  # register kprobe on function 'foo'
>          rmmod kprobe.ko
>      done &
> 
> I think holding kprobe_mutex cannot make sure we get the refcount of the
> module.

Aah, yes, it cannot, because the kallsyms in a module will be removed
after module->state becomes MODULE_STATE_UNFORMED. Before UNFORMED,
the state is MODULE_STATE_GOING and the kprobe_module_callback() is
called at that point. Thus, the following scenario happens.

             CPU1 					CPU2

mod->state = MODULE_STATE_GOING
kprobe_module_callback() {
	mutex_lock(&kprobe_mutex)
		loop on kprobe_table 
		to disable kprobe in the module.
	mutex_unlock(&kprobe_mutex)
}
						register_kprobe(p) {
							mutex_lock(&kprobe_mutex)
							check_kprobe_address_safe(p->addr) {
								[A'']
								is_module_text_address() return true
								until mod->state == UNFORMED.
mod->state = MODULE_STATE_UNFORMED
								[B'']
								__module_text_address() returns NULL.
							}
							p is on the kprobe_table.
							mutex_unlock(&kprobe_mutex)

So, as your fix, if we save the module at [A''] and use it at [B''],
the mod is NOT able to get because mod->state != MODULE_STATE_LIVE.


> 
> > I think your patch is just optimizing but not fixing the fundamental
> > problem, which is we don't have an atomic search symbol and get module
> 
> Sorry, this patch is a little confusing, but it is not just optimizing :)
> 
> As shown below, after my patch, if p->addr is in a module which is alive
> at the timing [A'] but unloaded at timing [B'], then *probed_mod must
> not be NULL. Then after timing [B'], it will go to try_module_get() and
> expected to fail and return -ENOENT. So this is the different.
> 
>      check_kprobe_address_safe() {
>          ...
>          *probed_mod = NULL;
>          if (!core_kernel_text((unsigned long) p->addr)) {
> 
>              /* Timing [A'] */
> 
>              *probed_mod = __module_text_address((unsigned long) p->addr);
>              if (!(*probed_mod)) {
>                  return -EINVAL;
>              }
>          }
>          ...
> 
>          /* Timing [B'] */
> 
>          if (*probed_mod) {
>              if (!try_module_get(*probed_mod)) {
>                  return -ENOENT;
>              }
>              ...
>          }

OK, I got it. Hmm, but this is a bit long story to explain, the
root cause is the delay of module unloading process. So more
precisely, we can explain it as below.

----
When unloading a module, its state is changing MODULE_STATE_LIVE -> 
 MODULE_STATE_GOING -> MODULE_STATE_UNFORMED. Each change will take
a time. `is_module_text_address()` and `__module_text_address()`
works with MODULE_STATE_LIVE and MODULE_STATE_GOING.
If we use `is_module_text_address()` and `__module_text_address()`
separately, there is a chance that the first one is succeeded but the
next one is failed because module->state becomes MODULE_STATE_UNFORMED
between those operations.

In `check_kprobe_address_safe()`, if the second `__module_text_address()`
is failed, that is ignored because it expected a kernel_text address.
But it may have failed simply because module->state has been changed
to MODULE_STATE_UNFORMED. In this case, arm_kprobe() will try to modify
non-exist module text address (use-after-free).

To fix this problem, we should not use separated `is_module_text_address()`
and `__module_text_address()`, but use only `__module_text_address()` once
and do `try_module_get(module)` which is only available with
MODULE_STATE_LIVE.
----

Would it be good for you too? The code itself looks good to me now :-)

Thank you!

> 
> > API. In that case, we should stop a whole module unloading system until
> > registering a new kprobe on a module. (After registering the kprobe,
> > the callback can mark it gone and disarm_kprobe does not work anymore.)
> > 
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 9d9095e81792..94eaefd1bc51 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -1633,11 +1633,11 @@ int register_kprobe(struct kprobe *p)
> >   	p->nmissed = 0;
> >   	INIT_LIST_HEAD(&p->list);
> >   
> > +	mutex_lock(&kprobe_mutex);
> > +
> >   	ret = check_kprobe_address_safe(p, &probed_mod);
> >   	if (ret)
> > -		return ret;
> > -
> > -	mutex_lock(&kprobe_mutex);
> > +		goto out;
> >   
> >   	if (on_func_entry)
> >   		p->flags |= KPROBE_FLAG_ON_FUNC_ENTRY;
> > 
> > ----
> > 
> > Thank you,
> > 
> >>
> >> To fix it, originally we can simply check 'p->addr' is out of text again,
> >> like below. But that would check twice respectively in kernel text and
> >> module text, so finally I reduce them to be once.
> >>
> >>    if (!(core_kernel_text((unsigned long) p->addr) ||
> >>        is_module_text_address((unsigned long) p->addr)) || ...) {
> >> 	ret = -EINVAL;
> >> 	goto out;
> >>    }
> >>    ...
> >>    *probed_mod = __module_text_address((unsigned long) p->addr);
> >>    if (*probed_mod) {
> >> 	...
> >>    } else if (!core_kernel_text((unsigned long) p->addr)) { // check again!
> >> 	ret = -EINVAL;
> >> 	goto out;
> >>    }
> >>
> >> Signed-off-by: Zheng Yejian <zhengyejian1@...wei.com>
> >> ---
> >>   kernel/kprobes.c | 18 ++++++++++++------
> >>   1 file changed, 12 insertions(+), 6 deletions(-)
> >>
> >> v2:
> >>   - Update commit messages and comments as suggested by Masami.
> >>     Link: https://lore.kernel.org/all/20240408115038.b0c85767bf1f249eccc32fff@kernel.org/
> >>
> >> v1:
> >>   - Link: https://lore.kernel.org/all/20240407035904.2556645-1-zhengyejian1@huawei.com/
> >>
> >> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> >> index 9d9095e81792..65adc815fc6e 100644
> >> --- a/kernel/kprobes.c
> >> +++ b/kernel/kprobes.c
> >> @@ -1567,10 +1567,17 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >>   	jump_label_lock();
> >>   	preempt_disable();
> >>   
> >> -	/* Ensure it is not in reserved area nor out of text */
> >> -	if (!(core_kernel_text((unsigned long) p->addr) ||
> >> -	    is_module_text_address((unsigned long) p->addr)) ||
> >> -	    in_gate_area_no_mm((unsigned long) p->addr) ||
> >> +	/* Ensure the address is in a text area, and find a module if exists. */
> >> +	*probed_mod = NULL;
> >> +	if (!core_kernel_text((unsigned long) p->addr)) {
> >> +		*probed_mod = __module_text_address((unsigned long) p->addr);
> >> +		if (!(*probed_mod)) {
> >> +			ret = -EINVAL;
> >> +			goto out;
> >> +		}
> >> +	}
> >> +	/* Ensure it is not in reserved area. */
> >> +	if (in_gate_area_no_mm((unsigned long) p->addr) ||
> >>   	    within_kprobe_blacklist((unsigned long) p->addr) ||
> >>   	    jump_label_text_reserved(p->addr, p->addr) ||
> >>   	    static_call_text_reserved(p->addr, p->addr) ||
> >> @@ -1580,8 +1587,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >>   		goto out;
> >>   	}
> >>   
> >> -	/* Check if 'p' is probing a module. */
> >> -	*probed_mod = __module_text_address((unsigned long) p->addr);
> >> +	/* Get module refcount and reject __init functions for loaded modules. */
> >>   	if (*probed_mod) {
> >>   		/*
> >>   		 * We must hold a refcount of the probed module while updating
> >> -- 
> >> 2.25.1
> >>
> >
> --
> Thanks
> Zheng Yejian
> > 
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@...nel.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ