lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 16 Dec 2015 18:28:35 +0800
From:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
To:	Steven Rostedt <rostedt@...dmis.org>
CC:	"Qiu, PeiyangX" <peiyangx.qiu@...el.com>,
	linux-kernel@...r.kernel.org, mingo@...hat.com,
	Rusty Russell <rusty@...tcorp.com.au>
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod


On 2015/12/16 1:37, Steven Rostedt wrote:
> On Tue, 15 Dec 2015 11:26:41 +0800
> "Zhang, Yanmin" <yanmin_zhang@...ux.intel.com> wrote:
>
>>> This seems very hackish, although I can't think of a better way at the
>>> moment. But I would like not to add more code into module.c if
>>> possible, and just use a notifier unless there's a real reason we can't
>>> (as there was when we added the hook in the first place).
>>> We would double-check it again. It's not a good idea to change common
>>> module codes.  
>> I double-checked it. If using notifier, we have to add 2 new module notification
>> events in enum module_state.
>> For example, MODULE_STATE_INITING, MODULE_STATE_INITING_FAILED.
>> At MODULE_STATE_INITING, call ftrace_module_init(mod);
>> At MODULE_STATE_INITING_FAILED, call ftrace_release_mod (or similar function);
>> At MODULE_STATE_COMING, call a new function which links the new start_pg
>> to ftrace_pages.
>> Such design can also fix another bug in current kernel that ftrace start_pg
>> of the module is not cleaned up if complete_formation fails.
>> However, we change module common codes. The new events only serve ftrace.
>>
>> If not holding ftrace_mutex across start_pg setup and complete_formation,
>> some other variables are not protected well, such like ftrace_pages, 
>> ftrace_start_up, and so on.
> OK, that was basically the reason the hook was added in the first place.
>
> The reason the modules notifier doesn't work here is that there's too
> many exits from the module code that may leave the mutex held.
>
> I thought about this some more. So the issue is that we add the module
> functions to the ftrace records. Then another task enables function
> tracing before the module is fully set up. As ftrace is enabling
> function tracing, the module sets its text to read-only, outside the
> scope of ftrace setting all text to read-write. When ftrace gets to the
> module code, it fails to do the modifications and you get the
> ftrace_bug() splat.
>
> Sounds right?
>
> Now, I really dislike the taking of the ftrace mutex and returning back
> to module handling. That is very subtle, fragile and prone to bugs. I
> took a step back to find another solution and I think I found one.
>
> Take the below patch and add to it (I'll keep this patch as mine, and
> you can submit another patch to do the following). You don't need to
> resend this patch, just send me a patch that does this:
>
>
> Add the module notifier that calls ftrace_module_enable(mod), removing
> it from ftrace_init_module(). Feel free to monkey with that function to
> be able to be called directly if it can't already.
>
> What my patch does is to create a new ftrace record flag DISABLED,
> which is set to all functions in a module as it is added. Then later on
> (in the module notifier that you will add), the flag is cleared. In
> between, the module functions will be ignored.
>
>
> If the module errors out and the notifier is not called, we don't care.
> The records will simply be removed, and the flag will be meaningless.
>
> I needed to move the code around to still let the module code get
> enabled if tracing was already enabled. I even cleaned that code up as
> I found it has gotten a bit messy.
>
> -- Steve
>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 4736a826baf5..660e7c698f3b 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -357,6 +357,7 @@ bool is_ftrace_trampoline(unsigned long addr);
>   *  REGS    - the record wants the function to save regs
>   *  REGS_EN - the function is set up to save regs.
>   *  IPMODIFY - the record allows for the IP address to be changed.
> + *  DISABLED - the record is not ready to be touched yet
>   *
>   * When a new ftrace_ops is registered and wants a function to save
>   * pt_regs, the rec->flag REGS is set. When the function has been
> @@ -371,10 +372,11 @@ enum {
>  	FTRACE_FL_TRAMP		= (1UL << 28),
>  	FTRACE_FL_TRAMP_EN	= (1UL << 27),
>  	FTRACE_FL_IPMODIFY	= (1UL << 26),
> +	FTRACE_FL_DISABLED	= (1UL << 25),
>  };
>  
> -#define FTRACE_REF_MAX_SHIFT	26
> -#define FTRACE_FL_BITS		6
> +#define FTRACE_REF_MAX_SHIFT	25
> +#define FTRACE_FL_BITS		7
>  #define FTRACE_FL_MASKED_BITS	((1UL << FTRACE_FL_BITS) - 1)
>  #define FTRACE_FL_MASK		(FTRACE_FL_MASKED_BITS << FTRACE_REF_MAX_SHIFT)
>  #define FTRACE_REF_MAX		((1UL << FTRACE_REF_MAX_SHIFT) - 1)
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index e290a30f2d0b..e7e15874fb7c 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2023,6 +2023,9 @@ static int ftrace_check_record(struct dyn_ftrace *rec, int enable, int update)
>  
>  	ftrace_bug_type = FTRACE_BUG_UNKNOWN;
>  
> +	if (rec->flags & FTRACE_FL_DISABLED)
> +		return FTRACE_UPDATE_IGNORE;
> +
>  	/*
>  	 * If we are updating calls:
>  	 *
> @@ -2849,64 +2852,41 @@ ops_references_rec(struct ftrace_ops *ops, struct dyn_ftrace *rec)
>  	return 1;
>  }
>  
> -static int referenced_filters(struct dyn_ftrace *rec)
> -{
> -	struct ftrace_ops *ops;
> -	int cnt = 0;
> -
> -	for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
> -		if (ops_references_rec(ops, rec))
> -		    cnt++;
> -	}
> -
> -	return cnt;
> -}
> -
>  static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
>  {
>  	struct ftrace_page *pg;
>  	struct dyn_ftrace *p;
>  	cycle_t start, stop;
>  	unsigned long update_cnt = 0;
> -	unsigned long ref = 0;
> -	bool test = false;
> +	unsigned long rec_flags = 0;
>  	int i;
>  
> +	start = ftrace_now(raw_smp_processor_id());
> +
>  	/*
> -	 * When adding a module, we need to check if tracers are
> -	 * currently enabled and if they are set to trace all functions.
> -	 * If they are, we need to enable the module functions as well
> -	 * as update the reference counts for those function records.
> +	 * When a module is loaded, this function is called to convert
> +	 * the calls to mcount in its text to nops, and also to create
> +	 * an entry in the ftrace data. Now, if ftrace is activated
> +	 * after this call, but before the module sets its text to
> +	 * read-only, the modification of enabling ftrace can fail if
> +	 * the read-only is done while ftrace is converting the calls.
> +	 * To prevent this, the module's records are set as disabled
> +	 * and will be enabled after the call to set the module's text
> +	 * to read-only.
>  	 */
> -	if (mod) {
> -		struct ftrace_ops *ops;
> -
> -		for (ops = ftrace_ops_list;
> -		     ops != &ftrace_list_end; ops = ops->next) {
> -			if (ops->flags & FTRACE_OPS_FL_ENABLED) {
> -				if (ops_traces_mod(ops))
> -					ref++;
> -				else
> -					test = true;
> -			}
> -		}
> -	}
> -
> -	start = ftrace_now(raw_smp_processor_id());
> +	if (mod)
> +		rec_flags |= FTRACE_FL_DISABLED;
>  
>  	for (pg = new_pgs; pg; pg = pg->next) {
>  
>  		for (i = 0; i < pg->index; i++) {
> -			int cnt = ref;
>  
>  			/* If something went wrong, bail without enabling anything */
>  			if (unlikely(ftrace_disabled))
>  				return -1;
>  
>  			p = &pg->records[i];
> -			if (test)
> -				cnt += referenced_filters(p);
> -			p->flags = cnt;
> +			p->flags = rec_flags;
>  
>  			/*
>  			 * Do the initial record conversion from mcount jump
> @@ -2916,21 +2896,6 @@ static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
>  				break;
>  
>  			update_cnt++;
> -
> -			/*
> -			 * If the tracing is enabled, go ahead and enable the record.
> -			 *
> -			 * The reason not to enable the record immediatelly is the
> -			 * inherent check of ftrace_make_nop/ftrace_make_call for
> -			 * correct previous instructions.  Making first the NOP
> -			 * conversion puts the module to the correct state, thus
> -			 * passing the ftrace_make_call check.
> -			 */
> -			if (ftrace_start_up && cnt) {
> -				int failed = __ftrace_replace_code(p, 1);
> -				if (failed)
> -					ftrace_bug(failed, p);
> -			}
>  		}
>  	}
>  
> @@ -4938,6 +4903,19 @@ static int ftrace_process_locs(struct module *mod,
>  
>  #define next_to_ftrace_page(p) container_of(p, struct ftrace_page, next)
>  
> +static int referenced_filters(struct dyn_ftrace *rec)
> +{
> +	struct ftrace_ops *ops;
> +	int cnt = 0;
> +
> +	for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
> +		if (ops_references_rec(ops, rec))
> +		    cnt++;
> +	}
> +
> +	return cnt;
> +}
> +
>  void ftrace_release_mod(struct module *mod)
>  {
>  	struct dyn_ftrace *rec;
> @@ -4980,12 +4958,82 @@ void ftrace_release_mod(struct module *mod)
>  	mutex_unlock(&ftrace_lock);
>  }
>  
> +static void ftrace_module_enable(struct module *mod)
> +{
> +	struct dyn_ftrace *rec;
> +	struct ftrace_page *pg;
> +
> +	mutex_lock(&ftrace_lock);
> +
> +	if (ftrace_disabled)
> +		goto out_unlock;
> +
> +	/*
> +	 * If the tracing is enabled, go ahead and enable the record.
> +	 *
> +	 * The reason not to enable the record immediatelly is the
> +	 * inherent check of ftrace_make_nop/ftrace_make_call for
> +	 * correct previous instructions.  Making first the NOP
> +	 * conversion puts the module to the correct state, thus
> +	 * passing the ftrace_make_call check.
> +	 *
> +	 * We also delay this to after the module code already set the
> +	 * text to read-only, as we now need to set it back to read-write
> +	 * so that we can modify the text.
> +	 */
> +	if (ftrace_start_up)
> +		ftrace_arch_code_modify_prepare();
> +
> +	do_for_each_ftrace_rec(pg, rec) {
> +		int cnt;
> +		/*
> +		 * do_for_each_ftrace_rec() is a double loop.
> +		 * module text shares the pg. If a record is
> +		 * not part of this module, then skip this pg,
> +		 * which the "break" will do.
> +		 */
> +		if (!within_module_core(rec->ip, mod))
> +			break;
> +
> +		cnt = 0;
> +
> +		/*
> +		 * When adding a module, we need to check if tracers are
> +		 * currently enabled and if they are, and can trace this record,
> +		 * we need to enable the module functions as well as update the
> +		 * reference counts for those function records.
> +		 */
> +		if (ftrace_start_up)
> +			cnt += referenced_filters(rec);
> +
> +		/* This clears FTRACE_FL_DISABLED */
> +		rec->flags = cnt;
> +
> +		if (ftrace_start_up && cnt) {
> +			int failed = __ftrace_replace_code(rec, 1);

If we choose to call ftrace_module_enable when receiving module notification
  MODULE_STATE_COMING, TEXT section of the module is already changed to RO.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ