lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20121021062402.7c4c4cb8@sacrilege>
Date:	Sun, 21 Oct 2012 06:24:02 +0600
From:	Mike Kazantsev <mk.fraggod@...il.com>
To:	Paul Moore <paul@...l-moore.com>
Cc:	netdev@...r.kernel.org, linux-mm@...ck.org
Subject: Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup"
 (xfrm) in 3.5+ kernels

On Sun, 21 Oct 2012 04:45:40 +0600
Mike Kazantsev <mk.fraggod@...il.com> wrote:

> 
> kmemleak mechanism seem to provide stack traces and interesting calls
> for debugging of whatever is allocating the non-freed objects, so guess
> I'll see if I can get more definitive (to my ignorant eye) "look here"
> hint from it, and might drop one more mail with data from there.
> 

kmemleak finds a lot (dozens megabytes of stack traces) of identical
paths leading to a leaks:

(for IPv6 packets)
unreferenced object 0xffff88002fa25b00 (size 56):
  comm "softirq", pid 0, jiffies 4295009073 (age 295.620s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 00 fc 6e 30 00 88 ff ff  ..........n0....
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  backtrace:
    [<ffffffff814cfa2b>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810d9445>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff8147dd35>] secpath_dup+0x1b/0x5a
    [<ffffffff8147df39>] xfrm_input+0x64/0x484
    [<ffffffff814b1d2c>] xfrm6_rcv_spi+0x19/0x1b
    [<ffffffff814b1d4e>] xfrm6_rcv+0x20/0x22
    [<ffffffff8148c19f>] ip6_input_finish+0x203/0x31b
    [<ffffffff8148c622>] ip6_input+0x1e/0x50
    [<ffffffff8148c31c>] ip6_rcv_finish+0x65/0x69
    [<ffffffff8148c5a3>] ipv6_rcv+0x283/0x2e4
    [<ffffffff813ff8ba>] __netif_receive_skb+0x599/0x64c
    [<ffffffff813ffb08>] netif_receive_skb+0x47/0x78
    [<ffffffff81400644>] napi_skb_finish+0x21/0x53
    [<ffffffff81400778>] napi_gro_receive+0x102/0x10e
    [<ffffffff8136978b>] rtl8169_poll+0x326/0x4f9
    [<ffffffff813ffcda>] net_rx_action+0x9f/0x175

(for IPv4 packets)
unreferenced object 0xffff88003387e000 (size 56):
  comm "softirq", pid 0, jiffies 4294915803 (age 563.583s)
  hex dump (first 32 bytes):
    01 00 00 00 01 00 00 00 00 48 be 30 00 88 ff ff  .........H.0....
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  backtrace:
    [<ffffffff814cfa2b>] kmemleak_alloc+0x21/0x3e
    [<ffffffff810d9445>] kmem_cache_alloc+0xa5/0xb1
    [<ffffffff8147dd35>] secpath_dup+0x1b/0x5a
    [<ffffffff8147df39>] xfrm_input+0x64/0x484
    [<ffffffff81474f7b>] xfrm4_rcv_encap+0x17/0x19
    [<ffffffff81474f9c>] xfrm4_rcv+0x1f/0x21
    [<ffffffff81430514>] ip_local_deliver_finish+0x170/0x22a
    [<ffffffff81430706>] ip_local_deliver+0x46/0x78
    [<ffffffff8143038d>] ip_rcv_finish+0x2bd/0x2d4
    [<ffffffff81430969>] ip_rcv+0x231/0x28c
    [<ffffffff813ff8ba>] __netif_receive_skb+0x599/0x64c
    [<ffffffff813ffb08>] netif_receive_skb+0x47/0x78
    [<ffffffff81400644>] napi_skb_finish+0x21/0x53
    [<ffffffff81400778>] napi_gro_receive+0x102/0x10e
    [<ffffffff8136978b>] rtl8169_poll+0x326/0x4f9
    [<ffffffff813ffcda>] net_rx_action+0x9f/0x175

Object at the top and trace seem to be the same (between same
IP-family) everywhere, just ages and addresses are different.

IPv6 usage seem to be one important detail which I failed to mention.
IPv4 traces seem to be really rare (only several of them), but that
might be understandable because rsync was ran over IPv6.

Still wasn't able to figure out what might cause the get's/put's
disbalance with that commit, but was able to revert it, without
anything bad happening (so far), using the patch below (in case
issue might bite someone else before proper fix is found).


--

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6e04b1f..52a9d40 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -427,26 +427,8 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 				   unsigned int length, gfp_t gfp_mask)
 {
 	struct sk_buff *skb = NULL;
-	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
-			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-
-	if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) {
-		void *data;
-
-		if (sk_memalloc_socks())
-			gfp_mask |= __GFP_MEMALLOC;
-
-		data = __netdev_alloc_frag(fragsz, gfp_mask);
-
-		if (likely(data)) {
-			skb = build_skb(data, fragsz);
-			if (unlikely(!skb))
-				put_page(virt_to_head_page(data));
-		}
-	} else {
-		skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask,
+	skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask,
 				  SKB_ALLOC_RX, NUMA_NO_NODE);
-	}
 	if (likely(skb)) {
 		skb_reserve(skb, NET_SKB_PAD);
 		skb->dev = dev;


-- 
Mike Kazantsev // fraggod.net

Download attachment "signature.asc" of type "application/pgp-signature" (199 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ