[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <29C1DC0826876849BDD9F1C67ABA2943070F1534@ala-mail09.corp.ad.wrs.com>
Date: Wed, 24 Feb 2010 11:05:12 -0800
From: "Stephens, Allan" <allan.stephens@...driver.com>
To: "Neil Horman" <nhorman@...driver.com>
Cc: <netdev@...r.kernel.org>, <jon.maloy@...csson.com>,
<tipc-discussion@...ts.sourceforge.net>, <davem@...emloft.net>
Subject: RE: [PATCH]: tipc: Fix oops on send prior to entering networked mode
Hi Neil:
Have you tried upgrading your system to use TIPC 1.7.6 (available at
http://tipc.sourceforge.net/tipc_linux.html)? This is a significant
revised and enhanced version of TIPC that hasn't yet made its way into
mainsteam Linux, and seems to be the version-of-choice for most TIPC
users. It also appears to avoid a number of the issues that currently
exist in TIPC 1.6, including the one caused by the random configuration
command you mentioned in your email below.
I didn't have a problem with you working on a small patch for TIPC 1.6
to get around a limited problem, but I'd hate to see you waste time on
fixing issues that have already been addressed in TIPC 1.7.
Regards,
Al
> -----Original Message-----
> From: Neil Horman [mailto:nhorman@...driver.com]
> Sent: Wednesday, February 24, 2010 1:54 PM
> To: Stephens, Allan
> Cc: netdev@...r.kernel.org; jon.maloy@...csson.com;
> tipc-discussion@...ts.sourceforge.net; davem@...emloft.net
> Subject: Re: [PATCH]: tipc: Fix oops on send prior to
> entering networked mode
>
> On Tue, Feb 23, 2010 at 08:21:25AM -0800, Stephens, Allan wrote:
> > Neil wrote:
> >
> > > I agree that you patch fixes the exact problem that I
> reported here,
> > > but theres more to it than that. A quick grep of the tipc stack
> > > reveals the following
> > > symbols:
> > > tipc_bearers
> > > media_list
> > > tipc_local_nodes
> > > bcbearer
> > > bclink
> > > tipc_net.zones
> > >
> > > All of these symbols:
> > >
> > > 1) Are allocated dynamically in tipc_net_start, _after_
> tipc_mode is
> > > set to TIPC_NET_MODE
> > >
> > > 2) dereferenced without NULL pointer checks in either the
> send path
> > > or in the netlink configuration path, both of which are reachable
> > > from user space.
> > >
> > > So your patch fixes the last item on your list, but what
> about the
> > > others? In fact, I'll bet I can very quickly change the
> application
> > > to trip over a null tipc_local_nodes dereference by changing the
> > > destination address to be something within zone 0, cluster 0.
> >
> > The semantics of TIPC addressing don't allow a node address of the
> > form <0.0.N> where N != 0, so this kind of a send ateempt should be
> > caught and handled by TIPC. However, you've already found
> one missing
> > error check, so it's certainly worth trying it out!
> >
>
> So, I've tested out your patch, and it fixes the problem that
> was reported (no suprisingly, it was pretty clear that it
> would), it also managed to fix the access to tipc_local_nodes
> (I'd previously missed the chunk of the patch that added the
> ? to that access), so thats all great. Then I did this:
> ./tipc-config -lsr 1.1.10:eth3-1.1.17:eth2
>
> And got this:
>
> BUG: unable to handle kernel NULL pointer dereference at
> 00000000000000c4
> IP: [<ffffffffa030f6a8>] tipc_bearer_find_interface+0x1f/0x66
> [tipc] PGD 11be1a067 PUD 11c721067 PMD 0
> Oops: 0000 [#1] SMP
> last sysfs file:
> /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
> CPU 3
> Pid: 1284, comm: tipc-config Not tainted 2.6.33-rc8 #1
> 0YK962/PowerEdge SC1435
> RIP: 0010:[<ffffffffa030f6a8>] [<ffffffffa030f6a8>]
> tipc_bearer_find_interface+0x1f/0x66 [tipc]
> RSP: 0018:ffff88011c7b3a08 EFLAGS: 00010246
> RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffff88011c7b3900
> RDX: 0000000000000000 RSI: ffff88011c7b39c0 RDI: ffff88011c7b3a3c
> RBP: ffff88011c7b3a28 R08: 0000000000000032 R09: 000000000000000a
> R10: ffffffffa03258b8 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff88011c7b3a3c R14: 0000000000000040 R15: ffffffff829a89b0
> FS: 00007fd2d1c56700(0000) GS:ffff880082400000(0000)
> knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000000c4 CR3: 000000011c7aa000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400 Process tipc-config (pid: 1284, threadinfo
> ffff88011c7b2000, task
> ffff88011cba48a0)
> Stack:
> ffff880023022260 ffff88011c7b3a38 ffff880023022260
> ffff88011c7b3aa0 <0> ffff88011c7b3a88 ffffffffa031294c
> 336874650100100a ffff880023022200 <0> 0100101100000040
> ffffff0032687465 ffff88011c7b3a88 00000000c78c4377 Call Trace:
> [<ffffffffa031294c>] link_find_link+0x40/0x9d [tipc]
> [<ffffffffa0312fce>] ? tipc_link_cmd_reset_stats+0x63/0xbb
> [tipc] [<ffffffffa0312fda>]
> tipc_link_cmd_reset_stats+0x6f/0xbb [tipc]
> [<ffffffffa031093d>] tipc_cfg_do_cmd+0x2ae/0x7d4 [tipc]
> [<ffffffffa031a506>] handle_cmd+0x68/0xba [tipc]
> [<ffffffff813d493f>] genl_rcv_msg+0x1c7/0x1eb
> [<ffffffff813d4778>] ? genl_rcv_msg+0x0/0x1eb
> [<ffffffff813d3895>] netlink_rcv_skb+0x43/0x94
> [<ffffffff813d4771>] genl_rcv+0x26/0x2d [<ffffffff813d3664>]
> netlink_unicast+0x125/0x18e [<ffffffff813d3e4e>]
> netlink_sendmsg+0x259/0x268 [<ffffffff813a57d4>]
> __sock_sendmsg+0x5e/0x69 [<ffffffff813a7f37>]
> sock_aio_write+0xc0/0xd4 [<ffffffff8107e57f>] ?
> print_lock_contention_bug+0x1b/0xe0
> [<ffffffff81114f59>] do_sync_write+0xc4/0x101
> [<ffffffff811e5cd9>] ? selinux_file_permission+0xa7/0xb3
> [<ffffffff811dbdf1>] ? security_file_permission+0x16/0x18
> [<ffffffff811154e8>] vfs_write+0xc1/0x10b
> [<ffffffff8107cb35>] ? trace_hardirqs_on_caller+0x1f/0x141
> [<ffffffff811155f2>] sys_write+0x4a/0x6e
> [<ffffffff81009b82>] system_call_fastpath+0x16/0x1b
> Code: 5f 5b 41 5c 41 5d 41 5e 41 5f c9 c3 55 48 89 e5 41 55
> 41 54 53 48 83 ec 08 0f 1f 44 00 00 48 8b 1d 86 68 01 00 45
> 31 e4 49 89 fd <83> bb c4 00 00 00 00 74 1e 48 8d 7b 5c be 3a
> 00 00 00 e8 RIP [<ffffffffa030f6a8>]
> tipc_bearer_find_interface+0x1f/0x66 [tipc] RSP <ffff88011c7b3a08>
> CR2: 00000000000000c4
> ---[ end trace a14d3b6105c45243 ]---
>
> The use of that command was arbitrary, it was just one of the
> paths from user space to one of the variables that I
> mentioned previously. And my patch wouldn't have fixed this
> either, but its illustrative of my earlier assertion, that
> theres no real synchronization between the user-space
> accessable paths in tipc and the implementation state which
> should gate access to those paths.
>
> Now we could continue to add NULL checks whereever these bugs
> pop up (and perhaps in the above case specifically that might
> be valid, I'm not sure yet), but that just feels like a
> loosing battle that we might never quite catch up with, as
> the code evolves. Seems to me like a better solution would
> be adding common gating in the netlink and send paths to
> ensure that only when the system was properly configured
> could you send various messages into the stack.
>
> I'll try to put an omibus patch together shortly.
>
> Thanks!
> Neil
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists