lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 08 Sep 2017 15:41:42 -0500
From:   Tom Zanussi <tom.zanussi@...ux.intel.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     tglx@...utronix.de, mhiramat@...nel.org, namhyung@...nel.org,
        vedang.patel@...el.com, bigeasy@...utronix.de,
        joel.opensrc@...il.com, joelaf@...gle.com,
        mathieu.desnoyers@...icios.com, baohong.liu@...el.com,
        linux-kernel@...r.kernel.org, linux-rt-users@...r.kernel.org
Subject: Re: [PATCH v2 40/40] tracing: Add trace_event_buffer_reserve()
 variant that allows recursion

On Fri, 2017-09-08 at 16:27 -0400, Steven Rostedt wrote:
> On Tue,  5 Sep 2017 16:57:52 -0500
> Tom Zanussi <tom.zanussi@...ux.intel.com> wrote:
> 
> > Synthetic event generation requires the reservation of a second event
> > while the reservation of a previous event is still in progress.  The
> > trace_recursive_lock() check in ring_buffer_lock_reserve() prevents
> > this however.
> > 
> > This sets up a special reserve pathway for this particular case,
> > leaving existing pathways untouched, other than an additional check in
> > ring_buffer_lock_reserve() and trace_event_buffer_reserve().  These
> > checks could be gotten rid of as well, with copies of those functions,
> > but for now try to avoid that unless necessary.
> > 
> > Signed-off-by: Tom Zanussi <tom.zanussi@...ux.intel.com>
> 
> I've been planing on changing that lock, which may help you here
> without having to mess around with parameters. That is to simply add a
> counter. Would this patch help you. You can add a patch to increment
> the count to 5 with an explanation of handling synthetic events, but
> even getting to 4 is extremely unlikely.
> 
> I'll make this into an official patch if this works for you, and then
> you can include it in your series.
> 

Yes, this should definitely work, and in fact I considered something
similar at one point.  Thanks for doing this, it's much simpler than I
would have ended up with!

Tom


> -- Steve
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 0bcc53e..9dbb459 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -2582,61 +2582,29 @@ rb_wakeups(struct ring_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
>   * The lock and unlock are done within a preempt disable section.
>   * The current_context per_cpu variable can only be modified
>   * by the current task between lock and unlock. But it can
> - * be modified more than once via an interrupt. To pass this
> - * information from the lock to the unlock without having to
> - * access the 'in_interrupt()' functions again (which do show
> - * a bit of overhead in something as critical as function tracing,
> - * we use a bitmask trick.
> + * be modified more than once via an interrupt. There are four
> + * different contexts that we need to consider.
>   *
> - *  bit 0 =  NMI context
> - *  bit 1 =  IRQ context
> - *  bit 2 =  SoftIRQ context
> - *  bit 3 =  normal context.
> + *  Normal context.
> + *  SoftIRQ context
> + *  IRQ context
> + *  NMI context
>   *
> - * This works because this is the order of contexts that can
> - * preempt other contexts. A SoftIRQ never preempts an IRQ
> - * context.
> - *
> - * When the context is determined, the corresponding bit is
> - * checked and set (if it was set, then a recursion of that context
> - * happened).
> - *
> - * On unlock, we need to clear this bit. To do so, just subtract
> - * 1 from the current_context and AND it to itself.
> - *
> - * (binary)
> - *  101 - 1 = 100
> - *  101 & 100 = 100 (clearing bit zero)
> - *
> - *  1010 - 1 = 1001
> - *  1010 & 1001 = 1000 (clearing bit 1)
> - *
> - * The least significant bit can be cleared this way, and it
> - * just so happens that it is the same bit corresponding to
> - * the current context.
> + * If for some reason the ring buffer starts to recurse, we
> + * only allow that to happen at most 4 times (one for each
> + * context). If it happens 5 times, then we consider this a
> + * recusive loop and do not let it go further.
>   */
>  
>  static __always_inline int
>  trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
>  {
> -	unsigned int val = cpu_buffer->current_context;
> -	int bit;
> -
> -	if (in_interrupt()) {
> -		if (in_nmi())
> -			bit = RB_CTX_NMI;
> -		else if (in_irq())
> -			bit = RB_CTX_IRQ;
> -		else
> -			bit = RB_CTX_SOFTIRQ;
> -	} else
> -		bit = RB_CTX_NORMAL;
> -
> -	if (unlikely(val & (1 << bit)))
> +	if (cpu_buffer->current_context >= 4)
>  		return 1;
>  
> -	val |= (1 << bit);
> -	cpu_buffer->current_context = val;
> +	cpu_buffer->current_context++;
> +	/* Interrupts must see this update */
> +	barrier();
>  
>  	return 0;
>  }
> @@ -2644,7 +2612,9 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
>  static __always_inline void
>  trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer)
>  {
> -	cpu_buffer->current_context &= cpu_buffer->current_context - 1;
> +	/* Don't let the dec leak out */
> +	barrier();
> +	cpu_buffer->current_context--;
>  }
>  
>  /**


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ