Message-ID: <20120718171409.GJ25563@hmsreliant.think-freely.org>
Date: Wed, 18 Jul 2012 13:14:09 -0400
From: Neil Horman <nhorman@...driver.com>
To: John Fastabend <john.r.fastabend@...el.com>
Cc: davem@...emloft.net, gaofeng@...fujitsu.com,
mark.d.rustad@...el.com, netdev@...r.kernel.org,
eric.dumazet@...il.com
Subject: Re: [RFC PATCH] net: cgroup: null ptr dereference in netprio cgroup
during init
On Wed, Jul 18, 2012 at 08:14:40AM -0700, John Fastabend wrote:
> On 7/18/2012 7:21 AM, John Fastabend wrote:
> >On 7/18/2012 5:45 AM, Neil Horman wrote:
> >>On Tue, Jul 17, 2012 at 05:33:16PM -0700, John Fastabend wrote:
> >>>When the netprio cgroup is built in the kernel cgroup_init will call
> >>>cgrp_create which eventually calls update_netdev_tables. This is
> >>>being called before do_initcalls() so a null ptr dereference occurs
> >>>on init_net.
> >>>
> >>>This patch adds a check on init_net.count to verify the structure
> >>>has been initialized. The failure was introduced here,
> >>>
> >>>commit ef209f15980360f6945873df3cd710c5f62f2a3e
> >>>Author: Gao feng <gaofeng@...fujitsu.com>
> >>>Date: Wed Jul 11 21:50:15 2012 +0000
> >>>
> >>> net: cgroup: fix access the unallocated memory in netprio cgroup
> >>>
> >>>Tested with ping with netprio_cgroup as a module and built in.
> >>>
> >>>Marked RFC for now I think DaveM might have a reason why this needs
> >>>some improvement.
> >>>
> >>>Reported-by: Mark Rustad <mark.d.rustad@...el.com>
> >>>Cc: Neil Horman <nhorman@...driver.com>
> >>>Cc: Eric Dumazet <edumazet@...gle.com>
> >>>Cc: Gao feng <gaofeng@...fujitsu.com>
> >>>Signed-off-by: John Fastabend <john.r.fastabend@...el.com>
> >>>---
> >>>
> >>> net/core/netprio_cgroup.c | 3 +++
> >>> 1 files changed, 3 insertions(+), 0 deletions(-)
> >>>
> >>>diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> >>>index b2e9caa..e9fd7fd 100644
> >>>--- a/net/core/netprio_cgroup.c
> >>>+++ b/net/core/netprio_cgroup.c
> >>>@@ -116,6 +116,9 @@ static int update_netdev_tables(void)
> >>> u32 max_len;
> >>> struct netprio_map *map;
> >>>
> >>>+ if (!atomic_read(&init_net.count))
> >>>+ return ret;
> >>>+
> >>> rtnl_lock();
> >>> max_len = atomic_read(&max_prioidx) + 1;
> >>> for_each_netdev(&init_net, dev) {
> >>>
> >>>
> >>
> >>John, do you have a stack trace of this? I'm having a hard time seeing how we
> >>get into this path prior to the network stack being initialized.
> >
> >Mark had a partial trace
> >
> >[ 0.003455] Dentry cache hash table entries: 262144 (order: 9,
> >2097152 bytes)
> >[ 0.005550] Inode-cache hash table entries: 131072 (order: 8, 1048576
> >bytes)
> >[ 0.007165] Mount-cache hash table entries: 256
> >[ 0.010289] Initializing cgroup subsys net_cls
> >[ 0.010947] Initializing cgroup subsys net_prio
> >[ 0.011039] BUG: unable to handle kernel NULL pointer dereference at
> >0000000000000828
> >[ 0.011998] IP: [<ffffffff814202c8>] update_netdev_tables+0x68/0xe0
> >
> >
> >>
> >>It also brings up another point. If this is happening, and we're creating the
> >>root cgroup from start_kernel, then we're actually initializing some cgroups
> >>twice, because a few cgroups register themselves via cgroup_load_subsys in
> >>module_init specified routines. So if you're building netprio_cgroup or
> >>net_cls_cgroup as part of the monolithic kernel, you'll get cgroup_create called
> >>prior to your module_init() call. That's not good.
> >
> >Well, your module_init() wouldn't be called in this case, right? I think
> >netprio has a bug where we only register a netdevice notifier when
> >it's built as a module.
> >
> >same issue with cls_cgroup and register_tcf_proto_ops?
> >
>
> Neil, I was very unclear in the above. What I meant here was
> cgroup_load_subsys() checks ss->module so you should _not_
> get two create calls. It also returns 0, so the register calls for
> netdev notifiers should still get set up.
>
> I missed the return 0 part and so I thought we might abort before
> this occurs but it looks ok to me on second glance.
>
John, et al.
Just so we all have it, I've got the problem reproduced here, and it gives me
this backtrace:
[ 0.149924] Mount-cache hash table entries: 256
[ 0.163754] Initializing cgroup subsys cpuacct
[ 0.176991] Initializing cgroup subsys memory
[ 0.190012] Initializing cgroup subsys devices
[ 0.203249] Initializing cgroup subsys freezer
[ 0.216484] Initializing cgroup subsys net_cls
[ 0.229719] Initializing cgroup subsys blkio
[ 0.242436] Initializing cgroup subsys perf_event
[ 0.256451] Initializing cgroup subsys net_prio
[ 0.269948] BUG: unable to handle kernel NULL pointer dereference at
0000000000000698
[ 0.293303] IP: [<ffffffff81512e37>] cgrp_create+0x107/0x1c0
[ 0.310175] PGD 0
[ 0.316157] Oops: 0000 [#1] SMP
[ 0.325775] CPU 0
[ 0.331227] Modules linked in:
[ 0.340846]
[ 0.345264] Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7+ #1 AMD Dinar/Dinar
[ 0.366555] RIP: 0010:[<ffffffff81512e37>] [<ffffffff81512e37>]
cgrp_create+0x107/0x1c0
[ 0.390681] RSP: 0000:ffffffff81c01ea8 EFLAGS: 00010213
[ 0.406501] RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
[ 0.427764] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81c9d840
[ 0.449026] RBP: ffffffff81c01ed8 R08: 00000000000164e0 R09: 0000000000000000
[ 0.470289] R10: ffff8804278303c0 R11: 0000000000000000 R12: 0000000000000001
[ 0.491553] R13: ffff8804278303c0 R14: ffff881036fd0700 R15: 0000000000000000
[ 0.512819] FS: 0000000000000000(0000) GS:ffff880427c00000(0000)
knlGS:0000000000000000
[ 0.536932] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 0.554049] CR2: 0000000000000698 CR3: 0000000001c0b000 CR4: 00000000000406b0
[ 0.575311] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.596574] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 0.617838] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, task
ffffffff81c13420)
[ 0.642471] Stack:
[ 0.648442] ffffffff81c01eb8 ffffffff81c9f320 ffffffff81c9f320
ffffffff81c9f320
[ 0.670522] ffffffff81c9f320 ffffffff81d482c0 ffffffff81c01ef8
ffffffff81d10397
[ 0.692604] ffffffff81e99790 0000000000000048 ffffffff81c01f18
ffffffff81d1062e
[ 0.714687] Call Trace:
[ 0.721960] [<ffffffff81d10397>] cgroup_init_subsys+0x51/0xdf
[ 0.739337] [<ffffffff81d1062e>] cgroup_init+0x36/0x119
[ 0.755160] [<ffffffff81cf5c02>] start_kernel+0x38f/0x3c4
[ 0.771501] [<ffffffff81cf5672>] ? repair_env_string+0x5e/0x5e
[ 0.789138] [<ffffffff81cf5356>] x86_64_start_reservations+0x131/0x135
[ 0.808849] [<ffffffff81cf545a>] x86_64_start_kernel+0x100/0x10f
[ 0.827003] Code: 10 ff ff ff 75 25 e9 89 00 00 00 66 0f 1f 84 00 00 00 00 00
48 8b 93 f0 00 00 00 48 81 fa 38 39 f9 81 48 8d 9a 10 ff ff ff 74 69 <48> 8b 93
88 07 00 00 48 85 d2 74 dd 44 3b 62 10 76 d7 48 8d bb
[ 0.883860] RIP [<ffffffff81512e37>] cgrp_create+0x107/0x1c0
[ 0.900988] RSP <ffffffff81c01ea8>
[ 0.911366] CR2: 0000000000000698
[ 0.921235] ---[ end trace a7919e7f17c0a725 ]---
So yes, it appears to me that we're calling cgrp_create from cgroup_init_subsys
prior to having the module_init routine called for netprio_cgroup. It seems to
me that (given that we have a cgroup_early_init patch) we can move the
cgroup_init call to later in the boot process. I'll spend some time in
the next few weeks tinkering with that.
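
For anyone following along, the ordering problem and the guard from the RFC
patch can be modelled in user space. This is only a rough sketch under stated
assumptions: fake_net, fake_dev, fake_net_init and fake_update_tables are
made-up stand-ins, not kernel definitions, and the "count" field merely plays
the role that init_net.count plays in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct fake_dev {
    struct fake_dev *next;
    int prio_len;            /* length of this device's priority map */
};

struct fake_net {
    int count;               /* stays 0 until "network init" has run */
    struct fake_dev *devs;   /* device list, meaningful only once count > 0 */
};

/* Zero-initialised, like a structure nobody has set up yet in early boot. */
static struct fake_net fake_init_net;

/* Models the later init step that sets up the namespace.
 * The assert catches the "initialised twice" case raised in the thread. */
static void fake_net_init(struct fake_dev *devs)
{
    assert(fake_init_net.count == 0);
    fake_init_net.devs = devs;
    fake_init_net.count = 1;
}

/* Models update_netdev_tables() with the guard from the RFC patch.
 * Returns the number of devices whose map was grown, or 0 if the
 * namespace has not been initialised yet (the early-boot call). */
static int fake_update_tables(int max_len)
{
    int resized = 0;
    struct fake_dev *dev;

    if (!fake_init_net.count)
        return 0;            /* too early: do not walk the device list */

    for (dev = fake_init_net.devs; dev != NULL; dev = dev->next) {
        if (dev->prio_len < max_len) {
            dev->prio_len = max_len;
            resized++;
        }
    }
    return resized;
}
```

Calling fake_update_tables() before fake_net_init() corresponds to the
cgroup_init path in the trace above; the guard turns that call into a no-op
rather than fixing the ordering itself, which is why moving cgroup_init later
still seems worth looking at.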
Best
Neil