Message-ID: <1350856053.8609.217.camel@edumazet-glaptop>
Date: Sun, 21 Oct 2012 23:47:33 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Mike Kazantsev <mk.fraggod@...il.com>
Cc: Paul Moore <paul@...l-moore.com>, netdev@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: PROBLEM: Memory leak (at least with SLUB) from "secpath_dup"
(xfrm) in 3.5+ kernels
On Mon, 2012-10-22 at 01:51 +0600, Mike Kazantsev wrote:
> On Mon, 22 Oct 2012 00:43:32 +0600
> Mike Kazantsev <mk.fraggod@...il.com> wrote:
>
> > > On Sun, 21 Oct 2012 15:29:43 +0200
> > > Eric Dumazet <eric.dumazet@...il.com> wrote:
> > >
> > > >
> > > > Did you try linux-3.7-rc2 (or linux-3.7-rc1) ?
> > > >
> >
> > I just built "torvalds/linux-2.6" (v3.7-rc2) and rebooted into it,
> > started same rsync-over-net test and got kmalloc-64 leaking (it went up
> > to tens of MiB until I stopped rsync, normally these are fixed at ~500
> > KiB).
> >
> > Unfortunately, I forgot to add slub_debug option and build kmemleak so
> > wasn't able to look at this case further, and when I rebooted with
> > these enabled/built, it was secpath_cache again.
> >
> > So the previously noted "slabtop showed 'kmalloc-64' being the 99% offender
> > in the past, but with recent kernels (3.6.1), it has changed to
> > 'secpath_cache'" seems to be incorrect, as it seems to depend not on
> > kernel version, but on some other factor.
> >
> > Guess I'll try to reboot a few more times to see if I can catch
> > kmalloc-64 leaking (instead of secpath_cache) again.
> >
>
> I haven't been able to catch the aforementioned condition, but noticed
> that with v3.7-rc2, the "hex dump" part seems to vary in kmemleak
> traces and contains all sorts of random stuff, for example:
>
> unreferenced object 0xffff88002ae2de00 (size 56):
>   comm "softirq", pid 0, jiffies 4295006317 (age 213.066s)
>   hex dump (first 32 bytes):
>     01 00 00 00 01 00 00 00 20 9f f4 28 00 88 ff ff  ........ ..(....
>     2f 6f 72 67 2f 66 72 65 65 64 65 73 6b 74 6f 70  /org/freedesktop
>   backtrace:
>     [<ffffffff814da4e3>] kmemleak_alloc+0x21/0x3e
>     [<ffffffff810dc1f7>] kmem_cache_alloc+0xa5/0xb1
>     [<ffffffff81487bf1>] secpath_dup+0x1b/0x5a
>     [<ffffffff81487df5>] xfrm_input+0x64/0x484
>     [<ffffffff814bbd70>] xfrm6_rcv_spi+0x19/0x1b
>     [<ffffffff814bbd92>] xfrm6_rcv+0x20/0x22
>     [<ffffffff814960c3>] ip6_input_finish+0x203/0x31b
>     [<ffffffff81496542>] ip6_input+0x1e/0x50
>     [<ffffffff81496240>] ip6_rcv_finish+0x65/0x69
>     [<ffffffff814964c3>] ipv6_rcv+0x27f/0x2e0
>     [<ffffffff8140a659>] __netif_receive_skb+0x5ba/0x65a
>     [<ffffffff8140a894>] netif_receive_skb+0x47/0x78
>     [<ffffffff8140b4bf>] napi_skb_finish+0x21/0x54
>     [<ffffffff8140b5ef>] napi_gro_receive+0xfd/0x10a
>     [<ffffffff81372b47>] rtl8169_poll+0x326/0x4fc
>     [<ffffffff8140ad44>] net_rx_action+0x9f/0x188
>
> Not sure if it's relevant though.
>
>
OK, so some layer seems to have a bug when skb->head is allocated at
exactly the requested size, instead of having extra tailroom (because of
kmalloc power-of-two rounding).

Or some layer writes past the end of the skb->cb[] array.

If you move the sp field in sk_buff, does anything change?

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6a2c34e..9b1438a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -395,6 +395,9 @@ struct sk_buff {
 	struct sock		*sk;
 	struct net_device	*dev;
 
+#ifdef CONFIG_XFRM
+	struct	sec_path	*sp;
+#endif
 	/*
 	 * This is the control buffer. It is free to use for every
 	 * layer. Please put your private variables there. If you
@@ -404,9 +407,6 @@ struct sk_buff {
 	char			cb[48] __aligned(8);
 
 	unsigned long		_skb_refdst;
-#ifdef CONFIG_XFRM
-	struct	sec_path	*sp;
-#endif
 	unsigned int		len,
 				data_len;
 	__u16			mac_len,
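
To illustrate why moving sp can help narrow this down, here is a minimal
user-space sketch. The two mock structs are hypothetical, heavily trimmed
stand-ins for struct sk_buff (not the real layout), and
mock_overrun_reaches() is a helper invented for this example. It only
checks whether a write that starts at cb[0] and runs past the 48-byte
array would land on sp in each layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed stand-in: sp sits after cb[] (layout before the patch). */
struct mock_skb_before {
	void *sk;
	void *dev;
	char cb[48] __attribute__((aligned(8)));
	unsigned long _skb_refdst;
	void *sp;
	unsigned int len;
};

/* Hypothetical, trimmed stand-in: sp sits in front of cb[] (layout after the patch). */
struct mock_skb_after {
	void *sk;
	void *dev;
	void *sp;
	char cb[48] __attribute__((aligned(8)));
	unsigned long _skb_refdst;
	unsigned int len;
};

/*
 * Return 1 if writing n bytes starting at offset cb_off overlaps the
 * field occupying [field_off, field_off + field_size), else 0.
 */
static int mock_overrun_reaches(size_t cb_off, size_t n,
				size_t field_off, size_t field_size)
{
	assert(field_size > 0);
	return cb_off < field_off + field_size && field_off < cb_off + n;
}

/* Does an n-byte write into cb[] clobber sp in the original layout? */
static int mock_hits_sp_before(size_t n)
{
	return mock_overrun_reaches(offsetof(struct mock_skb_before, cb), n,
				    offsetof(struct mock_skb_before, sp),
				    sizeof(void *));
}

/* Same question for the layout with sp moved in front of cb[]. */
static int mock_hits_sp_after(size_t n)
{
	return mock_overrun_reaches(offsetof(struct mock_skb_after, cb), n,
				    offsetof(struct mock_skb_after, sp),
				    sizeof(void *));
}
```

With these mock layouts, a 64-byte write into cb[] (16 bytes too many)
reaches sp in the "before" struct but not in the "after" struct. So if the
leak moves or disappears with the patch above, a cb[] overrun becomes the
more likely suspect.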

Also try increasing the tailroom in __netdev_alloc_skb():

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6e04b1f..972ee4f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -427,7 +427,7 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 				   unsigned int length, gfp_t gfp_mask)
 {
 	struct sk_buff *skb = NULL;
-	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
+	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD + 64) +
 			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
 	if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) {
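
As a rough illustration of the "exactly allocated" versus "extra tailroom"
difference, the sketch below redoes the fragsz arithmetic in user space.
All MOCK_* constants are assumptions chosen for the example (the real
SMP_CACHE_BYTES, NET_SKB_PAD and sizeof(struct skb_shared_info) depend on
architecture and config), and the power-of-two rounding is a simplified
stand-in for kmalloc size classes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define MOCK_SMP_CACHE_BYTES	64u
#define MOCK_NET_SKB_PAD	64u
#define MOCK_SHINFO_SIZE	320u	/* hypothetical sizeof(struct skb_shared_info) */

/* Round x up to a multiple of the cache line size, in the spirit of SKB_DATA_ALIGN(). */
static unsigned int mock_data_align(unsigned int x)
{
	return (x + MOCK_SMP_CACHE_BYTES - 1) & ~(MOCK_SMP_CACHE_BYTES - 1);
}

/* Smallest power of two >= x; simplified model of a kmalloc size class. */
static unsigned int mock_pow2_roundup(unsigned int x)
{
	unsigned int p = 1;

	assert(x > 0 && x <= (1u << 31));
	while (p < x)
		p <<= 1;
	return p;
}

/* Fragment size as in the hunk above, with `extra` bytes of added tailroom. */
static unsigned int mock_fragsz(unsigned int length, unsigned int extra)
{
	return mock_data_align(length + MOCK_NET_SKB_PAD + extra) +
	       mock_data_align(MOCK_SHINFO_SIZE);
}

/* Unrequested bytes a power-of-two allocator would leave after `size`. */
static unsigned int mock_pow2_slack(unsigned int size)
{
	return mock_pow2_roundup(size) - size;
}
```

Under these assumed constants, a 1500-byte request gives a fragsz of 1920
without the patch and 1984 with it. A power-of-two allocation of 1920
bytes would have left 128 spare bytes, while an exact-size fragment leaves
none, which is the case where a small overrun would start to corrupt
neighbouring memory.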