netdev - RE: [RFC] Handling device free after a packet is passed to the network stack

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <004401d06af4$94620280$bd260780$@codeaurora.org>
Date:	Mon, 30 Mar 2015 08:19:47 -0600
From:	"Harout Hedeshian" <harouth@...eaurora.org>
To:	"'Eric Dumazet'" <eric.dumazet@...il.com>,
	<subashab@...eaurora.org>
Cc:	<netdev@...r.kernel.org>
Subject: RE: [RFC] Handling device free after a packet is passed to the network stack



> -----Original Message-----
> From: netdev-owner@...r.kernel.org [mailto:netdev-owner@...r.kernel.org]
> On Behalf Of Eric Dumazet
> Sent: Friday, March 27, 2015 8:49 PM
> To: subashab@...eaurora.org
> Cc: netdev@...r.kernel.org
> Subject: Re: [RFC] Handling device free after a packet is passed to the
> network stack
> 
> On Fri, 2015-03-27 at 19:57 +0000, subashab@...eaurora.org wrote:
> > We have been coming across a couple of scenarios where the device is
> > freed and the corresponding packets which were already queued up the
> > stack encounter crashes when they find that contents of skb->dev are
> > no longer valid.
> >
> > Specifically, we have observed an instance where a cpu hotplug occurs
> > along with the network driver module unloading. When the packets are
> > being queued up the stack using netif_rx_ni from dev_cpu_callback,
> > get_rps_cpus crashes as it encounters invalid data at skb->dev since
> > it would have been freed.
> >
> > We would like to know if the kernel provides some mechanisms to
> > safeguard against such scenarios.
> 
> This is supposed to be handled in flush_backlog() (net/core/dev.c)

It looks like this only takes care of packets in the softnet_data queues (or am I missing something). Wouldn't we run into similar issues if the packet were stuck in, for example, TCP OFO socket queue (not sure if anything tries to access skb->dev after that)? Or perhaps some other queue which may later try to dereference skb->dev. 
 

It seems to me that the only way to be absolutely sure that there is no reference on this net_device would be to have the drivers dev_hold() before earch netif_rx/netif_rx_ni/netif_receive_skb; and then later have __kfree_skb() do a dev_put().

Of course, this approach has a number of issues:
- 2 extra operations per packet
- Anything holding onto an SKB can prevent a net_device from deregistering/freeing
- requires skbs to have a valid dev before they can be freed (or we simply null check this field in __kfree_skb())

> Maybe you hit some race condition in on_each_cpu().
> 
> Really, I do not think we have tested such stress conditions ever.
> 
> cpu hotplug was very unstable 2 years ago (last time I tried it, this
> was constantly failing outside of networking land)
> 
> I would try following patch, although I am not sure what would prevent
> dev_cpu_callback() being called too late.
> 
> diff --git a/net/core/dev.c b/net/core/dev.c index
> a0408d497dae04e7caa145f05c915b058aa2d356..fa74887adfe10709bce55ccf1e34d6
> 5c6b7a8fba 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -7149,11 +7149,17 @@ static int dev_cpu_callback(struct
> notifier_block *nfb,
> 
>  	/* Process offline CPU's input_pkt_queue */
>  	while ((skb = __skb_dequeue(&oldsd->process_queue))) {
> -		netif_rx_ni(skb);
> +		if (skb->dev->reg_state != NETREG_REGISTERED)
> +			kfree_skb(skb);
> +		else
> +			netif_rx_ni(skb);
>  		input_queue_head_incr(oldsd);
>  	}
>  	while ((skb = skb_dequeue(&oldsd->input_pkt_queue))) {
> -		netif_rx_ni(skb);
> +		if (skb->dev->reg_state != NETREG_REGISTERED)
> +			kfree_skb(skb);
> +		else
> +			netif_rx_ni(skb);
>  		input_queue_head_incr(oldsd);
>  	}
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in the
> body of a message to majordomo@...r.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html