lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 28 Apr 2011 13:16:20 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	"Serge E. Hallyn" <serge@...lyn.com>
Cc:	Will Drewry <wad@...omium.org>, linux-kernel@...r.kernel.org,
	kees.cook@...onical.com, eparis@...hat.com, agl@...omium.org,
	mingo@...e.hu, jmorris@...ei.org,
	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tejun Heo <tj@...nel.org>, Michal Marek <mmarek@...e.cz>,
	Oleg Nesterov <oleg@...hat.com>,
	Roland McGrath <roland@...hat.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Jiri Slaby <jslaby@...e.cz>,
	David Howells <dhowells@...hat.com>
Subject: Re: [PATCH 3/7] seccomp_filter: Enable ftrace-based system call
 filtering

On Thu, 2011-04-28 at 11:55 -0500, Serge E. Hallyn wrote:

> ...
> 
> >  void __secure_computing(int this_syscall)
> >  {
> > -	int mode = current->seccomp.mode;
> > +	int mode = -1;
> >  	int * syscall;
> > -
> > +	/* Do we need an RCU read lock to access current's state? */
> 
> Nope.

Correct.

> > - out:
> > +	rcu_assign_pointer(current->seccomp.state, state);
> > +	synchronize_rcu();
> > +	put_seccomp_state(orig_state);  /* for the get */
> > +
> > +out:
> > +	put_seccomp_state(orig_state);  /* for the task */
> > +	return ret;
> > +
> > +free_state:
> > +	put_seccomp_state(orig_state);  /* for the get */
> > +	put_seccomp_state(state);  /* drop the dup */
> >  	return ret;
> >  }
> 
> This looks exactly right.  The only case where put_seccomp_state()
> might actually lead to freeing the state is where the current's
> state gets reassigned.  So you need to synchronize_rcu() before
> that (as you do).  The other cases will only decrement the usage
> counter, can race with a reader doing (inc; get) but not with a
> final free, which can only be done here.

Technically incorrect ;)

"final free, which can only be done here."

This is not the only place that a free will happen. But the code is
correct none-the-less.

Reader on another CPU ups the orig_state refcount under rcu_readlock,
but after it ups the refcount it releases the rcu_readlock and continues
to read this state.

Current on this CPU calls this function does the synchronize_rcu() and
calls put on the state. But since the reader still has a ref count on
it, it does not get freed here.

When the reader is finally done with the state it calls the put() which
does the final free on it.

The code still looks correct, I'm just nitpicking your analysis.

> 
> (Rambling above is just me pursuading myself)

Me rambling too.

> 
> > diff --git a/kernel/seccomp_filter.c b/kernel/seccomp_filter.c
> 
> Unfortunately your use of filters doesn't seem exactly right.
> 
> > +/* seccomp_copy_all_filters - copies all filters from src to dst.
> > + *
> > + * @dst: the list_head for seccomp_filters to populate.
> > + * @src: the list_head for seccomp_filters to copy from.
> > + * Returns non-zero on failure.
> > + */
> > +int seccomp_copy_all_filters(struct list_head *dst,
> > +			     const struct list_head *src)
> > +{
> > +	struct seccomp_filter *filter;
> > +	int ret = 0;
> > +	BUG_ON(!dst || !src);
> > +	if (list_empty(src))
> > +		goto done;
> > +	rcu_read_lock();
> > +	list_for_each_entry(filter, src, list) {
> > +		struct seccomp_filter *new_filter = copy_seccomp_filter(filter);
> 
> copy_seccomp_filter() causes kzalloc to be called.  You can't do that under
> rcu_read_lock().

Unless you change the kzalloc to do GFP_ATOMIC. Not sure I'd recommend
doing that.

> 
> I actually thought you were going to be more extreme about the seccomp
> state than you are:  I thought you were going to tie a filter list to
> seccomp state.  So adding or removing a filter would have required
> duping the seccomp state, duping all the filters, making the change in
> the copy, and then swapping the new state into place.  Slow in the
> hopefully rare update case, but safe.
> 
> You don't have to do that, but then I'm pretty sure you'll need to add
> reference counts to each filter and use rcu cycles to a reader from
> having the filter disappear mid-read.

Or you can preallocate the new filters, call rcu_read_lock(), check if
the number of old filters is the same or less, if more, call
rcu_read_unlock, and try allocating more, and then call rcu_read_lock()
again and repeat. Then just copy the filters to the preallocate ones.
rcu_read_unlock() and then free any unused allocated filters.

Maybe a bit messy, but not that bad.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ