linux-kernel - Re: [PATCH] tracing: add trace_event_read

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20090520043833.GB6925@linux.vnet.ibm.com>
Date:	Tue, 19 May 2009 21:38:33 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH] tracing: add trace_event_read_lock()

On Wed, May 20, 2009 at 02:59:50AM +0200, Frederic Weisbecker wrote:
> On Tue, May 19, 2009 at 05:38:53AM -0700, Paul E. McKenney wrote:
> > On Tue, May 19, 2009 at 01:15:44PM +0800, Lai Jiangshan wrote:
> > > Paul E. McKenney wrote:
> > > > On Mon, May 18, 2009 at 03:59:31PM +0200, Frederic Weisbecker wrote:
> > > >> On Mon, May 18, 2009 at 07:35:34PM +0800, Lai Jiangshan wrote:
> > > >>> I found that there is nothing to protect event_hash in
> > > >>> ftrace_find_event().
> > > >> Actually, rcu protects it, but not enough. We have neither
> > > >> synchronize_rcu() nor rcu_read_lock.
> > > >>
> > > >> So we protect against concurrent hlist accesses.
> > > >> But the event can be removed when a module is unloaded,
> > > >> and that can happen between the time we get the event output
> > > >> callback and the time we actually use it.
> > > > 
> > > > I will ask the stupid question...  Would invoking rcu_barrier() in the
> > > > module-exit function take care of this?  The rcu_barrier() primitive
> > > > waits for all in-flight RCU callbacks to complete execution.
> > > > 
> > > > 							Thanx, Paul
> > > > 
> > > 
> > > We have no call_rcu* in it.
> > 
> > OK, I will ask the next stupid question...  Is it possible to use the
> > same trick rcu_barrier() uses, but for your events?
> > 
> > 							Thanx, Paul
> 
> 
> Hi Paul,
> 
> I think you're right, we could use RCU to solve this race.
> 
> The current race window to solve is as follows:
> 
> 
> --Task A--
> 
> print_trace_line(trace) {
> 	event = find_ftrace_event(trace) {
> 			hlist_for_each_entry_rcu(....) {
> 				...
> 				return found;
> 		}
> 	}
> 
> 
> --Task B--
> 
> /*
>  * Can be quite confusing.
>  * But we have "trace events" which are global objects
>  * handling tracing, output, everything.
>  * And we have also "ftrace events" which are unique
>  * identifiers on traces than make them able to retrieve  
>  * their associated output callbacks.
>  * So a "trace event" has an associated "ftrace event" to output
>  * itself.
>  */
> 
> trace_module_remove_events(mod) {
> 	list_trace_events_module(ev, mod) {
> 		unregister_ftrace_event(ev->event) {
> 			// mutex protected
> 			hlist_del(ev->event->node)
> 			list_del(....)
> 			//
> 		}
> 	}
> }
> //module removed

Ouch!!!  ;-)

> -- Task A--
> 
> event->print(trace) 
> 
> KABOOM! print callback was on the module. event doesn't even exist anymore.
> 
> So Lai's solution is correct to fix it.
> But rcu is also valid.
> 
> Currently rcu only protects the hashlist but not the callback until
> the job is complete.
> 
> What we could do is:
> 
> (Read side)
> 
> rcu_read_lock();
> 
> event = find_ftrace_event();
> event->print(trace);

As long as event->print() never blocks...

Though you could use SRCU if it might block:

	rcu_read_lock()		->	i = srcu_read_lock(&my_srcu_struct);
	rcu_read_unlock()	->	srcu_read_unlock(&my_srcu_struct, i);
	synchronize_rcu()	->	synchronize_srcu(&my_srcu_struct);

> rcu_read_unlock()
> 
> (Write side)
> 
> unregister_ftrace_event(e)
> {
> 	mutex_lock(&writer);
> 	hlist_del(e);
> 	list_del(e);
> 	mutex_unlock(&writer);
> 
> 	// Wait for every printing grace periods
> 	synchronize_rcu();
> }
> 
> 
> Then the only overhead is a preempt_disable()/preempt_enable()
> for each trace to be printed. Just a sub/add pair and quite few
> preemption disabled latency, just the time for a hashlist search
> and some work done to format a trace to be printed.
> But if latency does really matters, we have RCU_PREEMPT.
> 
> On the other side, the down_read() is an extra call but it is called
> only once for a whole loop of traces seeking.
> 
> I guess both solutions are valid.
> What do you think?

It could go either way.  As a very rough rule of thumb, if the down_read()
happens seldom, there is little advantage to RCU.  On the other hand,
if the down_read() could happen often enough that a new reader shows up
before the previous down_read() completes, then you really need RCU or
something like it.

On most systems, if down_read() was happening more than a few million
times per second, life might be hard.  On the other hand, if down_read()
never happened more than a few tens of thousand times per second or so,
should be no problem.

Since it sounds like a single down_read() substitutes for a potentially
large number of RCU read-side critical sections, I do not believe that
contention-free read-side overhead would be a problem.

Does that help?

							Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/