Message-ID: <93d1fdd10903091046w2d426226sfcb2a0d52c94a114@mail.gmail.com>
Date: Mon, 9 Mar 2009 12:46:31 -0500
From: Ron Yorgason <yorgasor@...il.com>
To: Eric Dumazet <dada1@...mosbay.com>
Cc: netdev@...r.kernel.org
Subject: Re: Kernel Oops in UDP w/ ARM architecture
We're using the fec driver, found in drivers/net/fec.c. I modified
this driver slightly to get the MAC address from the RedBoot
configuration stored in flash memory, but it's otherwise untouched. I
can send my version of the file if that would help.
--Ron
On Mon, Mar 9, 2009 at 12:16 PM, Eric Dumazet <dada1@...mosbay.com> wrote:
> Ron Yorgason wrote:
>> I'm working on an embedded video streaming application using gstreamer
>> over RTP/UDP on a Freescale iMX27 ARM platform. I have one board
>> doing the video capture and compression, and streaming it across the
>> network to another board which does the decoding and display. I'm
>> stuck right now with a kernel oops we're getting. It usually occurs
>> within 2-6 hours, but sometimes it takes longer for it to happen. I
>> believe it always dies with the same address in the failure.
>>
>> I'm using a 2.6.19.2 kernel release. I don't know if this problem has
>> already been found and fixed in a later release (I didn't see any
>> mention of it in the changelogs of the next few releases), but this is
>> a customized kernel and I don't know how feasible it would be to port
>> all the changes to a newer kernel. We haven't touched the networking
>> stack, so it's most likely this bug is in the stock release.
>>
>> Unable to handle kernel paging request at virtual address c6f9202a
>> pgd = c6d7c000
>> [c6f9202a] *pgd=a6e0041e(bad)
>> Internal error: Oops: 1 [#3]
>> Modules linked in:
>> CPU: 0
>> PC is at udp_recvmsg+0x184/0x21c
>> LR is at 0xf2799669
>> pc : [<c024a3e0>] lr : [<f2799669>] Not tainted
>> sp : c6f9fd48 ip : 00000000 fp : c6f9fd80
>> r10: c6f9fea0 r9 : 00000000 r8 : 00000400
>> r7 : 00000400 r6 : c7a52200 r5 : c6f9ff20 r4 : c6291780
>> r3 : c6f9201e r2 : 00000000 r1 : 00000008 r0 : c6f9fea8
>> Flags: NzCv IRQs on FIQs on Mode SVC_32 Segment user
>> Control: 5317F
>> Table: A6D7C000 DAC: 00000015
>> Process gst-launch-0.10 (pid: 18165, stack limit = 0xc6f9e250)
>> Stack: (0xc6f9fd48 to 0xc6fa0000)
>> fd40: 00000001 00000000 00000000 00000000 c02fbb80 c6f9ff20
>> fd60: c6f9ff20 00000400 00000000 00000000 00000000 c6f9fda8 c6f9fd84 c0207468
>> fd80: c024a26c 00000000 00000000 c6f9fd90 00000010 c6f9fdb0 c7c4fac0 c6f9fe9c
>> fda0: c6f9fdac c0205ae0 c020742c 00000000 c02e06c8 00000001 00000000 00000001
>> fdc0: ffffffff 00000000 00000000 00000000 00000000 00000000 c7c4fac0 00000000
>> fde0: 00000000 c6c5d720 c7c4fac0 c006a3a4 c6f9fdf0 c6f9fdf0 c6f9e000 ffffffff
>> fe00: c6f9fe34 c7176b60 c7176b90 8511a8c0 c6f9fea8 00000408 c6f9fe44 c6f9fe28
>> fe20: c0209ff8 00000001 00000004 40ee9e04 40ee9e04 00000000 00000000 00000000
>> fe40: 00000400 c759bba0 00000000 00000000 c6f9ff20 00000500 00000000 00000000
>> fe60: 00000400 00000000 00000000 c03714a4 c6f9fef8 00000000 00000400 00093800
>> fe80: c6f9fea0 c76d45a0 c6f9e000 40ee9e84 c6f9ff70 c6f9fea0 c0206990 c0205a30
>> fea0: 03080002 c005d660 a0000093 00043887 c7d6a000 000002c0 c7d6a2c0 60000013
>> fec0: c6f9fedc c6f9fed0 c005dbc0 c005da94 c6f9ff34 c6f9fee0 c018455c c005db90
>> fee0: 485a7d2d 00046731 00000400 c6f9ff10 c6f9fefc c024a130 c0059780 c76d45a0
>> ff00: 0000541b c6f9ff20 c6f9ff14 c024ff7c c024a0a8 c6f9ff3c c6f9ff24 c02052cc
>> ff20: c6f9fea0 00000080 c6f9ff3c 00000001 00000000 00000000 c00a8cf8 00093c00
>> ff40: 00000000 00000001 40ee9e9c 0000000c 00093800 00000400 00000066 c0038f84
>> ff60: 404fa2f0 c6f9ffa4 c6f9ff74 c0206e9c c0206908 40ee9e84 40ee9ea0 0000000a
>> ff80: 00093800 00000400 00000000 40ee9e84 40ee9ea0 000001c4 00000000 c6f9ffa8
>> ffa0: c0038de0 c0206d10 000001c4 00093800 0000000c 40ee9dd4 40eea56c 00000002
>> ffc0: 000001c4 00093800 00000400 0000000a 40ee9ea0 40ee9e84 404fa2f0 000350d0
>> ffe0: 00000000 40ee9dd0 4020fe74 40210808 80000010 0000000c 033a0000 8c020000
>> Backtrace:
>> [<c024a25c>] (udp_recvmsg+0x0/0x21c) from [<c0207468>] (sock_common_recvmsg+0x4)
>> [<c020741c>] (sock_common_recvmsg+0x0/0x60) from [<c0205ae0>] (sock_recvmsg+0xc)
>> r5 = C7C4FAC0 r4 = C6F9FDB0
>> [<c0205a20>] (sock_recvmsg+0x0/0xec) from [<c0206990>] (sys_recvfrom+0x98/0xf0)
>> [<c02068f8>] (sys_recvfrom+0x0/0xf0) from [<c0206e9c>] (sys_socketcall+0x19c/0x)
>> [<c0206d00>] (sys_socketcall+0x0/0x1f0) from [<c0038de0>] (ret_fast_syscall+0x0)
>> r4 = 000001C4
>> Code: e28a0008 e1d330b0 e3a01008 e1ca30b2 (e5943020)
>>
>>
>> I did the disassembly to find out exactly where the failure occurs. I
>> put an asterisk by the address offset mentioned in the oops, but I
>> believe it's the next line down where it references the address where
>> it chokes.
>
> Yes, I agree: the load through (r3 + offset) is what chokes, not (r4 + offset).
>>
>> 00001ae4 <udp_recvmsg>:
>> 1ae4: e1a0c00d mov ip, sp
>> 1ae8: e92ddff0 stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
>> 1aec: e24cb004 sub fp, ip, #4 ; 0x4
>> 1af0: e24dd010 sub sp, sp, #16 ; 0x10
>> 1af4: e59b000c ldr r0, [fp, #12]
>> 1af8: e59b9008 ldr r9, [fp, #8]
>> 1afc: e3500000 cmp r0, #0 ; 0x0
>> 1b00: e1a08003 mov r8, r3
>> 1b04: 13a03010 movne r3, #16 ; 0x10
>> 1b08: e592a000 ldr sl, [r2]
>> 1b0c: 15803000 strne r3, [r0]
>> 1b10: e3190a02 tst r9, #8192 ; 0x2000
>> 1b14: e1a05002 mov r5, r2
>> 1b18: e1a06001 mov r6, r1
>> 1b1c: 0a000004 beq 1b34 <udp_recvmsg+0x50>
>> 1b20: e1a00001 mov r0, r1
>> 1b24: e1a01002 mov r1, r2
>> 1b28: e1a02008 mov r2, r8
>> 1b2c: ebfffffe bl 0 <ip_recv_error>
>> 1b30: ea00006e b 1cf0 <udp_recvmsg+0x20c>
>> 1b34: e1a01009 mov r1, r9
>> 1b38: e59b2004 ldr r2, [fp, #4]
>> 1b3c: e24b302c sub r3, fp, #44 ; 0x2c
>> 1b40: e1a00006 mov r0, r6
>> 1b44: ebfffffe bl 0 <skb_recv_datagram>
>> 1b48: e2504000 subs r4, r0, #0 ; 0x0
>> 1b4c: e3a01008 mov r1, #8 ; 0x8
>> 1b50: 0a000057 beq 1cb4 <udp_recvmsg+0x1d0>
>> 1b54: e5943060 ldr r3, [r4, #96]
>> 1b58: e2437008 sub r7, r3, #8 ; 0x8
>> 1b5c: e1570008 cmp r7, r8
>> 1b60: 85953018 ldrhi r3, [r5, #24]
>> 1b64: 81a07008 movhi r7, r8
>> 1b68: 83833020 orrhi r3, r3, #32 ; 0x20
>> 1b6c: 85853018 strhi r3, [r5, #24]
>> 1b70: e5d43074 ldrb r3, [r4, #116]
>> 1b74: e203300c and r3, r3, #12 ; 0xc
>> 1b78: e3530008 cmp r3, #8 ; 0x8
>> 1b7c: 01a01003 moveq r1, r3
>> 1b80: 0a000007 beq 1ba4 <udp_recvmsg+0xc0>
>> 1b84: e5953018 ldr r3, [r5, #24]
>> 1b88: e3130020 tst r3, #32 ; 0x20
>> 1b8c: 0a000009 beq 1bb8 <udp_recvmsg+0xd4>
>> 1b90: ebfffffe bl 0 <__skb_checksum_complete>
>> 1b94: e3500000 cmp r0, #0 ; 0x0
>> 1b98: 1a000047 bne 1cbc <udp_recvmsg+0x1d8>
>> 1b9c: e1a00004 mov r0, r4
>> 1ba0: e3a01008 mov r1, #8 ; 0x8
>> 1ba4: e5952008 ldr r2, [r5, #8]
>> 1ba8: e1a03007 mov r3, r7
>> 1bac: ebfffffe bl 0 <skb_copy_datagram_iovec>
>> 1bb0: e50b002c str r0, [fp, #-44]
>> 1bb4: ea000004 b 1bcc <udp_recvmsg+0xe8>
>> 1bb8: e5952008 ldr r2, [r5, #8]
>> 1bbc: ebfffffe bl 0 <skb_copy_and_csum_datagram_iovec>
>> 1bc0: e3700016 cmn r0, #22 ; 0x16
>> 1bc4: e50b002c str r0, [fp, #-44]
>> 1bc8: 0a00003b beq 1cbc <udp_recvmsg+0x1d8>
>> 1bcc: e51b302c ldr r3, [fp, #-44]
>> 1bd0: e3530000 cmp r3, #0 ; 0x0
>> 1bd4: 1a000033 bne 1ca8 <udp_recvmsg+0x1c4>
>> 1bd8: e594100c ldr r1, [r4, #12]
>> 1bdc: e5962094 ldr r2, [r6, #148]
>> 1be0: e50b1034 str r1, [fp, #-52]
>> 1be4: e5943010 ldr r3, [r4, #16]
>> 1be8: e3120b02 tst r2, #2048 ; 0x800
>> 1bec: e50b3030 str r3, [fp, #-48]
>> 1bf0: 0a00000f beq 1c34 <udp_recvmsg+0x150>
>> 1bf4: e3510000 cmp r1, #0 ; 0x0
>> 1bf8: 1a000001 bne 1c04 <udp_recvmsg+0x120>
>> 1bfc: e24b0034 sub r0, fp, #52 ; 0x34
>> 1c00: ebfffffe bl 0 <do_gettimeofday>
>> 1c04: e51b3034 ldr r3, [fp, #-52]
>> 1c08: e24bc034 sub ip, fp, #52 ; 0x34
>> 1c0c: e584300c str r3, [r4, #12]
>> 1c10: e51b3030 ldr r3, [fp, #-48]
>> 1c14: e1a00005 mov r0, r5
>> 1c18: e5843010 str r3, [r4, #16]
>> 1c1c: e3a01001 mov r1, #1 ; 0x1
>> 1c20: e3a0201d mov r2, #29 ; 0x1d
>> 1c24: e3a03008 mov r3, #8 ; 0x8
>> 1c28: e58dc000 str ip, [sp]
>> 1c2c: ebfffffe bl 0 <put_cmsg>
>> 1c30: ea000003 b 1c44 <udp_recvmsg+0x160>
>> 1c34: e24b2034 sub r2, fp, #52 ; 0x34
>> 1c38: e892000c ldmia r2, {r2, r3}
>> 1c3c: e58620f8 str r2, [r6, #248]
>> 1c40: e58630fc str r3, [r6, #252]
>> 1c44: e35a0000 cmp sl, #0 ; 0x0
>>
>>
>> 1c48: 0a00000a beq 1c78 <udp_recvmsg+0x194>
>> 1c4c: e3a03002 mov r3, #2 ; 0x2
>> 1c50: e1ca30b0 strh r3, [sl]
>> 1c54: e594301c ldr r3, [r4, #28]
>> 1c58: e28a0008 add r0, sl, #8 ; 0x8
>> 1c5c: e1d330b0 ldrh r3, [r3]
>> 1c60: e3a01008 mov r1, #8 ; 0x8
>> 1c64: e1ca30b2 strh r3, [sl, #2]
>> * 1c68: e5943020 ldr r3, [r4, #32]
>> 1c6c: e593300c ldr r3, [r3, #12]
>> 1c70: e58a3004 str r3, [sl, #4]
>> 1c74: ebfffffe bl 0 <__memzero>
>> 1c78: e59f3078 ldr r3, [pc, #120] ; 1cf8 <.text+0x1cf8>
>> 1c7c: e19630b3 ldrh r3, [r6, r3]
>>
>>
>> 1c80: e3530000 cmp r3, #0 ; 0x0
>> 1c84: 0a000002 beq 1c94 <udp_recvmsg+0x1b0>
>> 1c88: e1a00005 mov r0, r5
>> 1c8c: e1a01004 mov r1, r4
>> 1c90: ebfffffe bl 0 <ip_cmsg_recv>
>> 1c94: e3190020 tst r9, #32 ; 0x20
>> 1c98: e50b702c str r7, [fp, #-44]
>> 1c9c: 15943060 ldrne r3, [r4, #96]
>> 1ca0: 12433008 subne r3, r3, #8 ; 0x8
>> 1ca4: 150b302c strne r3, [fp, #-44]
>> 1ca8: e1a00006 mov r0, r6
>> 1cac: e1a01004 mov r1, r4
>> 1cb0: ebfffffe bl 0 <skb_free_datagram>
>> 1cb4: e51b002c ldr r0, [fp, #-44]
>> 1cb8: ea00000c b 1cf0 <udp_recvmsg+0x20c>
>> 1cbc: e59f3038 ldr r3, [pc, #56] ; 1cfc <.text+0x1cfc>
>> 1cc0: e1a02009 mov r2, r9
>> 1cc4: e593c000 ldr ip, [r3]
>> 1cc8: e1a01004 mov r1, r4
>> 1ccc: e59c300c ldr r3, [ip, #12]
>> 1cd0: e1a00006 mov r0, r6
>> 1cd4: e2833001 add r3, r3, #1 ; 0x1
>> 1cd8: e58c300c str r3, [ip, #12]
>> 1cdc: ebfffffe bl 0 <skb_kill_datagram>
>> 1ce0: e59b2004 ldr r2, [fp, #4]
>> 1ce4: e3520000 cmp r2, #0 ; 0x0
>> 1ce8: 0affff91 beq 1b34 <udp_recvmsg+0x50>
>> 1cec: e3e0000a mvn r0, #10 ; 0xa
>> 1cf0: e24bd028 sub sp, fp, #40 ; 0x28
>> 1cf4: e89daff0 ldmia sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc}
>> 1cf8: 00000146 andeq r0, r0, r6, asr #2
>> 1cfc: 00000000 andeq r0, r0, r0
>>
>>
>> In the udp_recvmsg() function, the fault occurs in this code:
>>         /* Copy the address. */
>>         if (sin)
>>         {
>>                 sin->sin_family = AF_INET;
>>                 sin->sin_port = skb->h.uh->source;
>>                 /* <- failure accessing memory at saddr */
>>                 sin->sin_addr.s_addr = skb->nh.iph->saddr;
>>                 memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
>>         }
>>
>>
>> After reviewing the assembly and the source code, it looks like the
>> address "c6f9202a" is where it thinks saddr should be. Ideally, I'd
>
> This address is not aligned to a word (multiple of 4), which seems strange...
>
> Maybe ARM doesn't handle unaligned accesses?
>
>
> 1c48: 0a00000a beq 1c78 <udp_recvmsg+0x194>
> 1c4c: e3a03002 mov r3, #2 ; 0x2
> 1c50: e1ca30b0 strh r3, [sl]
> 1c54: e594301c ldr r3, [r4, #28] skb->h.uh (udp hdr) OK
> 1c58: e28a0008 add r0, sl, #8 ; 0x8
> 1c5c: e1d330b0 ldrh r3, [r3]
> 1c60: e3a01008 mov r1, #8 ; 0x8
> 1c64: e1ca30b2 strh r3, [sl, #2]
> * 1c68: e5943020 ldr r3, [r4, #32] skb->nh.iph (IP header) OK
> 1c6c: e593300c ldr r3, [r3, #12] but (r3 + 12) = c6f9202a is unaligned
> 1c70: e58a3004 str r3, [sl, #4]
> 1c74: ebfffffe bl 0 <__memzero>
> 1c78: e59f3078 ldr r3, [pc, #120] ; 1cf8 <.text+0x1cf8>
> 1c7c: e19630b3 ldrh r3, [r6, r3]
>
> What is your NIC driver?
>> like to figure out how to solve the problem. From ifconfig, I'm
>> finding a few errors with overruns, so maybe the queue is wrapping
>> around and clobbering the sk_buffs.
>>
>> eth0 Link encap:Ethernet HWaddr 00:00:D0:D0:DA:D2
>> inet addr:192.168.17.133 Bcast:192.168.17.255 Mask:255.255.255.0
>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> RX packets:440979642 errors:8 dropped:0 overruns:8 frame:0
>> TX packets:601998 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 txqueuelen:1000
>> RX bytes:2838009823 (2.6 GiB) TX bytes:155320893 (148.1 MiB)
>> Base address:0xb000
>>
>> I'd also be willing to settle for a short term solution of finding a
>> way to test whether it's safe to dereference that pointer, and
>> skipping that sk_buff if it's bad.
>
>