lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160512164000.GA9815@breakpoint.cc>
Date:	Thu, 12 May 2016 18:40:00 +0200
From:	Florian Westphal <fw@...len.de>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Pablo Neira Ayuso <pablo@...filter.org>,
	Florian Westphal <fw@...len.de>,
	netfilter-devel@...r.kernel.org, dale.4d@...il.com,
	netdev@...r.kernel.org
Subject: Re: [PATCH nf V2] netfilter: fix oops in nfqueue during netns error
 unwinding

Eric W. Biederman <ebiederm@...ssion.com> wrote:
> > On Wed, May 11, 2016 at 05:41:13PM +0200, Florian Westphal wrote:
> >> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> >> index 5baa8e2..9722819 100644
> >> --- a/net/netfilter/nf_queue.c
> >> +++ b/net/netfilter/nf_queue.c
> >> @@ -102,6 +102,13 @@ void nf_queue_nf_hook_drop(struct net *net, struct nf_hook_ops *ops)
> >>  {
> >>  	const struct nf_queue_handler *qh;
> >>  
> >> +	/* netns wasn't initialized, error unwind in progress.
> >> +	 * Its possible that the nfq netns init function was not even
> >> +	 * called, in which case nfq pernetns data is in undefined state.
> >> +	 */
> >> +	if (!net->list.next)
> >> +		return;
> >
> > Thanks Florian.
> >
> > Question for the netns folks: Is this a stable assumption? My only
> > concern with this is that it relies on internal the netns
> > representation, so I'd like to make sure this thing doesn't break
> > again.
> >
> > Let me know, thanks.
> 
> Ugh. I got distracted before I finished replying.
> 
> I don't think the analysis is correct.

I am afraid it is, see below for example

> The unwind and the netns exit path should maintain the same constraints.
> If they don't there is a bug, perhaps in initialization ordering.

Since 8405a8fff3f8545c888a872d6e3c0c8eecd4d348 any call to
nf_unregister_hook invokes nf_queue_nf_hook_drop().

If the nfnetlink_queue module is loaded, that means we call
nfqnl_nf_hook_drop function via the nf_queue nf_queue_handler struct
in nf_queue.c .

And since nfqnl_nf_hook_drop() uses net_generic() ...

> Code should be able to count on net_generic() from the beginning of the
> corresponding .init to the end of the corresponding .fini

... we cannot count on this.

Example:
1. we try to create new netns
2. net_namespace.c:ops_init walks the netns op list and calls ->init()
3. some of these init hooks register netfilter hooks
4. we call init hook for nfnetlink_queue, but this yields -EFOO (alternatively, another ->init handler before this failed)
5. netns init code invokes the ->exit() handlers in reverse order for unwinding
6. hooks get unregistered, which makes a call into nf_queue_nf_hook_drop
7. nf_queue then calls q->nfqnl_nf_hook_drop(net)
8. oops as nfnetlink_queue wasn't setup in this net ns and net_generic
returns some undefined result

> Something like that is not happening and if we can I would like to track
> down why.
> 
> Otherwise we need code like the above all over the place.

AFAICS no other callers do something similar, but yes,
we'd need this all over the place if there are others.

Maybe we need a saner fix, e.g. by adding bounds check to net_generic(),
and making sure that net_generic() returns non-NULL only if the per
netns memory was allocated properly.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ