netdev - [PATCH] net, cgroup: Fix boot failure due to iteration of uninitialized list

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120719162532.23505.85946.stgit@srivatsabhat.in.ibm.com>
Date:	Thu, 19 Jul 2012 21:57:37 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	gaofeng@...fujitsu.com, eric.dumazet@...il.com,
	nhorman@...driver.com, davem@...emloft.net
Cc:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	mark.d.rustad@...el.com, john.r.fastabend@...el.com,
	lizefan@...wei.com
Subject: [PATCH] net,
 cgroup: Fix boot failure due to iteration of uninitialized list

After commit ef209f15 (net: cgroup: fix access the unallocated memory in
netprio cgroup), boot fails with the following NULL pointer dereference:

Initializing cgroup subsys devices
Initializing cgroup subsys freezer
Initializing cgroup subsys net_cls
Initializing cgroup subsys blkio
Initializing cgroup subsys perf_event
Initializing cgroup subsys net_prio
BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
IP: [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
PGD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in:

Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7-mandeep #1 IBM IBM System x -[7870C4Q]-/68Y8033
RIP: 0010:[<ffffffff8145e8d6>]  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
RSP: 0000:ffffffff81a01ea8  EFLAGS: 00010213
RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81aa70a0
RBP: ffffffff81a01ed8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff8808ff8641c0 R11: 6e697a696c616974 R12: 0000000000000001
R13: ffff8808ff8641c0 R14: 0000000000000000 R15: 0000000000093970
FS:  0000000000000000(0000) GS:ffff8808ffc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000698 CR3: 0000000001a0b000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a13420)
Stack:
 ffffffff81a01eb8 ffffffff818060ff ffffffff81d75ec8 ffffffff81aa8960
 ffffffff81aa8960 ffffffff81b4c2c0 ffffffff81a01ef8 ffffffff81b1cb78
 0000000000000018 0000000000000048 ffffffff81a01f18 ffffffff81b1ce13
Call Trace:
 [<ffffffff81b1cb78>] cgroup_init_subsys+0x83/0x169
 [<ffffffff81b1ce13>] cgroup_init+0x36/0x119
 [<ffffffff81affef7>] start_kernel+0x3ba/0x3ef
 [<ffffffff81aff95b>] ? kernel_init+0x27b/0x27b
 [<ffffffff81aff356>] x86_64_start_reservations+0x131/0x136
 [<ffffffff81aff45e>] x86_64_start_kernel+0x103/0x112
Code: 01 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 75 1b eb 73 0f 1f 00 48 8b 83 f0 00 00 00 48 3d f8 e1 ec 81 48 8d 98 10 ff ff ff 74 5a <48> 8b 83 88 07 00 00 48 85 c0 74 de 44 3b 60 10 76 d8 44 89 e6
RIP  [<ffffffff8145e8d6>] cgrp_create+0xf6/0x190
 RSP <ffffffff81a01ea8>
CR2: 0000000000000698
---[ end trace a7919e7f17c0a725 ]---
Kernel panic - not syncing: Attempted to kill the idle task!

The code corresponds to:

update_netdev_tables():
        for_each_netdev(&init_net, dev) {
                map = rtnl_dereference(dev->priomap);  <---- HERE


The list head is initialized in netdev_init(), which is called much
later than cgrp_create(). So the problem is that we are calling
update_netdev_tables() way too early (in cgrp_create()), which will
end up traversing the not-yet-circular linked list. So at some point,
the dev pointer will become NULL and hence dev->priomap becomes an
invalid access.

To fix this, just remove the update_netdev_tables() function entirely,
since it appears that write_update_netdev_table() will handle things
just fine.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@...ux.vnet.ibm.com>
---

Requesting a thorough review of this patch, since I am not sure whether
removing update_netdev_tables() is perfectly OK and whether that is the
right thing to do.

 net/core/netprio_cgroup.c |   32 --------------------------------
 1 files changed, 0 insertions(+), 32 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index b2e9caa..68fe239 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -109,32 +109,6 @@ static int write_update_netdev_table(struct net_device *dev)
 	return ret;
 }
 
-static int update_netdev_tables(void)
-{
-	int ret = 0;
-	struct net_device *dev;
-	u32 max_len;
-	struct netprio_map *map;
-
-	rtnl_lock();
-	max_len = atomic_read(&max_prioidx) + 1;
-	for_each_netdev(&init_net, dev) {
-		map = rtnl_dereference(dev->priomap);
-		/*
-		 * don't allocate priomap if we didn't
-		 * change net_prio.ifpriomap (map == NULL),
-		 * this will speed up skb_update_prio.
-		 */
-		if (map && map->priomap_len < max_len) {
-			ret = extend_netdev_table(dev, max_len);
-			if (ret < 0)
-				break;
-		}
-	}
-	rtnl_unlock();
-	return ret;
-}
-
 static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
 {
 	struct cgroup_netprio_state *cs;
@@ -153,12 +127,6 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
 		goto out;
 	}
 
-	ret = update_netdev_tables();
-	if (ret < 0) {
-		put_prioidx(cs->prioidx);
-		goto out;
-	}
-
 	return &cs->css;
 out:
 	kfree(cs);

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html