Message-ID: <1273129187.2304.14.camel@edumazet-laptop>
Date:	Thu, 06 May 2010 08:59:47 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Martín Ferrari <martin.ferrari@...il.com>,
	Arnd Bergmann <arnd@...db.de>
Cc:	netdev <netdev@...r.kernel.org>,
	Mathieu Lacage <mathieu.lacage@...hia.inria.fr>,
	David Miller <davem@...emloft.net>
Subject: Re: kernel panic when using netns+bridges+tc(netem)

On Thursday, 06 May 2010 at 08:47 +0200, Eric Dumazet wrote:
> On Thursday, 06 May 2010 at 08:40 +0200, Eric Dumazet wrote:
> > On Thursday, 06 May 2010 at 03:01 +0200, Martín Ferrari wrote:
> > > Hi there,
> > > 
> > > While working on my project that uses netns, I found another bug. This
> > > one causes a "Kernel panic - not syncing: Fatal exception in
> > > interrupt", and I can reproduce it in 2.6.33 and 2.6.34-rc5, but not
> > > in 2.6.32. It dies during a call to __free_skb.
> > > I tested this on my x86_64 laptop (2 cores) and on qemu. In qemu it
> > > was not triggered until I asked it to emulate 2 cpus instead of one,
> > > so it is probably an SMP-only issue.
> > > 
> > > Scenario:
> > > 
> > > I set up a number of network namespaces, each with two veths to netns
> > > 1. In the main namespace I take those veths and bridge them in pairs,
> > > to configure a linear topology; also I configure the netem qdisc to
> > > simulate link delay.
> > > 
> > > Once the network is set up, I run a client/server program to send UDP
> > > packets from one end of the topology to the other. After a few seconds
> > > of sending packets (not really deterministic) it panics.
> > > 
> > > Note that I didn't experience this problem when using only 2
> > > namespaces (so, no routing).
> > > 
> > > The dumps are below. They all come from qemu, as I couldn't use
> > > netconsole on the network at work, but I checked and the backtraces
> > > were essentially the same.
> > > 
...
> > Could you please try the following patch?
> > 
> > Thanks
> > 
> > [PATCH] veth: Dont kfree_skb() after dev_forward_skb()
> > 
> > In case of congestion, dev_forward_skb() already frees the skb.
> > 
> > Reported-by: Martín Ferrari <martin.ferrari@...il.com>
> > Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
> > ---
> > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> > index f9f0730..5ec542d 100644
> > --- a/drivers/net/veth.c
> > +++ b/drivers/net/veth.c
> > @@ -187,7 +187,6 @@ tx_drop:
> >  	return NETDEV_TX_OK;
> >  
> >  rx_drop:
> > -	kfree_skb(skb);
> >  	rcv_stats->rx_dropped++;
> >  	return NETDEV_TX_OK;
> >  }
> > 
> 
> Hmm, scratch that one, I'll resubmit a proper fix in a few minutes.
> 
> (We must change dev_forward_skb() too.)
> 

David, this is a stable candidate once tested and acked, thanks!

[PATCH] veth: Dont kfree_skb() after dev_forward_skb()

In case of congestion, netif_rx() frees the skb, so we must assume
dev_forward_skb() also consumes the skb.

Bug introduced by commit 445409602c092
(veth: move loopback logic to common location)

We must change dev_forward_skb() to always consume the skb, including
on its own early-drop paths, and change veth so that it does not free
the skb a second time.

Bug report: http://marc.info/?l=linux-netdev&m=127310770900442&w=3

Reported-by: Martín Ferrari <martin.ferrari@...il.com>
Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
---
 drivers/net/veth.c |    1 -
 net/core/dev.c     |   11 +++++------
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index f9f0730..5ec542d 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -187,7 +187,6 @@ tx_drop:
 	return NETDEV_TX_OK;
 
 rx_drop:
-	kfree_skb(skb);
 	rcv_stats->rx_dropped++;
 	return NETDEV_TX_OK;
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index f769098..264137f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1451,7 +1451,7 @@ static inline void net_timestamp(struct sk_buff *skb)
  *
  * return values:
  *	NET_RX_SUCCESS	(no congestion)
- *	NET_RX_DROP     (packet was dropped)
+ *	NET_RX_DROP     (packet was dropped, but freed)
  *
  * dev_forward_skb can be used for injecting an skb from the
  * start_xmit function of one device into the receive queue
@@ -1465,12 +1465,11 @@ int dev_forward_skb(struct net_device *dev, struct sk_buff *skb)
 {
 	skb_orphan(skb);
 
-	if (!(dev->flags & IFF_UP))
-		return NET_RX_DROP;
-
-	if (skb->len > (dev->mtu + dev->hard_header_len))
+	if (!(dev->flags & IFF_UP) ||
+	    (skb->len > (dev->mtu + dev->hard_header_len))) {
+		kfree_skb(skb);
 		return NET_RX_DROP;
-
+	}
 	skb_set_dev(skb, dev);
 	skb->tstamp.tv64 = 0;
 	skb->pkt_type = PACKET_HOST;
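
The ownership rule this patch settles on can be illustrated outside the
kernel. What follows is only a userspace sketch under stated assumptions,
not kernel code: every fake_* name is hypothetical, the buffer is a plain
malloc() allocation, and "congestion" is a flag passed in by hand. It
shows a forwarding helper that releases the buffer on every path, so the
caller only counts the drop and never frees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; none of these are kernel APIs. */
struct fake_skb { size_t len; };

enum { FAKE_RX_SUCCESS = 0, FAKE_RX_DROP = 1 };

static int fake_free_count;	/* number of times a buffer was released */

static void fake_kfree_skb(struct fake_skb *skb)
{
	assert(skb != NULL);
	fake_free_count++;
	free(skb);
}

/*
 * Models the receive side: it always takes ownership of the buffer.
 * In this model both delivery and a congestion drop release it.
 */
static int fake_netif_rx(struct fake_skb *skb, int congested)
{
	fake_kfree_skb(skb);
	return congested ? FAKE_RX_DROP : FAKE_RX_SUCCESS;
}

/*
 * Models the fixed forwarding helper: the early-drop paths free the
 * buffer too, so the buffer is consumed whatever the return value is.
 */
static int fake_forward_skb(int dev_up, size_t max_len,
			    struct fake_skb *skb, int congested)
{
	if (!dev_up || skb->len > max_len) {
		fake_kfree_skb(skb);
		return FAKE_RX_DROP;
	}
	return fake_netif_rx(skb, congested);
}

/*
 * Models the fixed transmit path: on a drop it only bumps the counter.
 * Freeing here as well would be the double free described in the report.
 * Returns 0 on completion, -1 if the buffer could not be allocated.
 */
static int fake_xmit(int dev_up, size_t max_len, size_t len,
		     int congested, unsigned long *rx_dropped)
{
	struct fake_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return -1;
	skb->len = len;

	if (fake_forward_skb(dev_up, max_len, skb, congested) != FAKE_RX_SUCCESS)
		(*rx_dropped)++;
	return 0;
}
```

Calling fake_xmit() once for each of the four cases (delivered, device
down, oversized, congested) should leave fake_free_count equal to the
number of buffers allocated, whatever the outcome of each call.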

