Message-ID: <543DEA0A.3080609@oracle.com>
Date: Wed, 15 Oct 2014 11:29:14 +0800
From: Joe Jin <joe.jin@...cle.com>
To: "zheng.li" <zheng.x.li@...cle.com>,
Sony Chacko <sony.chacko@...gic.com>,
Dept-HSGLinuxNICDev@...gic.com
CC: Michael Chan <mchan@...adcom.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Unable to handle kernel NULL pointer dereference at 0000000000000088
RIP: [<ffffffff881ba167>] :bnx2:bnx2_poll_work+0xc7/0x1253
Copying the new QLogic maintainers and netdev.
Thanks,
Joe
On 10/11/14 17:22, zheng.li wrote:
> Hi Michael,
> I hit a NULL pointer dereference in bnx2_poll_work. After analyzing the
> vmcore, I found:
> tx_prod = 27708 and hw_cons = bnx2_get_hw_tx_cons(bnapi) = 27722;
> hw_cons > tx_prod;
> so the root cause is most likely that the HW reports more sent data than
> the stack provided in bnx2_start_xmit, which causes a memory overwrite.
> Normally the HW can send at most up to tx_prod, but I don't know why it
> reports 14 entries beyond tx_prod.
>
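The comparison above can be written as a wrap-safe check on the 16-bit ring indices. This is only a minimal sketch for illustration: hw_cons_in_window() is a hypothetical helper, not a function in bnx2, and it assumes all three indices are free-running 16-bit counters.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the bnx2 driver).
 *
 * Returns 1 when hw_cons lies inside the window of descriptors the
 * driver has queued but not yet completed, i.e. tx_cons <= hw_cons <=
 * tx_prod in modulo-65536 arithmetic, and 0 otherwise.
 */
static int hw_cons_in_window(uint16_t tx_cons, uint16_t tx_prod,
                             uint16_t hw_cons)
{
    /* Number of descriptors queued by the driver and still outstanding. */
    uint16_t queued = (uint16_t)(tx_prod - tx_cons);
    /* Number of descriptors the hardware claims to have completed. */
    uint16_t reported = (uint16_t)(hw_cons - tx_cons);

    return reported <= queued;
}
```

With the values from the vmcore (tx_cons = 27603, tx_prod = 27708, hw_cons = 27722), queued is 105 and reported is 119, so the check fails by the same 14 entries noted above.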
> Can you help look at this issue? We have hit it several times.
> The bnx2 driver version is 2.1.11:
> #define DRV_MODULE_VERSION "2.1.11"
> #define DRV_MODULE_RELDATE "July 20, 2011"
> Kernel version is: 2.6.18-371.1.2.0.1
>
>
> The vmcore shows the following:
>
> crash64> bnx2 ffff81122c650500
> struct bnx2 {
> regview = 0xffffc200100e0000,
> dev = 0xffff81122c650000,
> pdev = 0xffff81122f0c9000,
> intr_sem = {
> counter = 0
> },
> flags = 22404,
> bnx2_napi = {{
> dummy_netdev = 0xffff81242ae3e800,
> bp = 0xffff81122c650500,
> status_blk = {
> msi = 0xffff81122391b000,
> msix = 0xffff81122391b000
> },
> hw_tx_cons_ptr = 0xffff81122391b00a,
> hw_rx_cons_ptr = 0xffff81122391b012,
> last_status_idx = 65048,
> int_num = 0,
> cnic_tag = 0,
> cnic_present = 0,
> rx_ring = {
> rx_prod_bseq = 1540471240,
> rx_prod = 21188,
> rx_cons = 20932,
> rx_bidx_addr = 65540,
> rx_bseq_addr = 65544,
> rx_pg_bidx_addr = 65604,
> rx_pg_prod = 0,
> rx_pg_cons = 0,
> rx_buf_ring = 0xffffc20014a7b000,
> rx_desc_ring = {0xffff81122b8a0000, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0},
> rx_pg_ring = 0x0,
> rx_pg_desc_ring = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
> rx_desc_mapping = {78039875584, 0, 0, 0, 0, 0, 0, 0},
> rx_pg_desc_mapping = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
> },
> tx_ring = {
> tx_prod_bseq = 3702584042,
> tx_prod = 27708,
> tx_bidx_addr = 69768,
> tx_bseq_addr = 69776,
> tx_desc_ring = 0xffff811227890000,
> tx_buf_ring = 0xffff81242f510000,
> tx_cons = 27603,
> hw_tx_cons = 27603,
> tx_desc_mapping = 77972701184
> }
> },
>
> crash64> rd 0xffff81122391b00a
> ffff81122391b00a: 0000000000006c4a
> hw_cons = 0x6c4a = 27722;
>
> usr/src/debug/kernel-2.6.18/linux-2.6.18-371.1.2.0.1.el5.x86_64/include/linux/skbuff.h: 921
> 0xffffffff881ba167 <bnx2_poll_work+199>: mov 0x88(%r13),%edx
> R13: 0000000000000000
> R13 holds skb, which is NULL at that point, so the load from 0x88(%r13)
> faults at address 0x88.
>
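To illustrate how a hw_cons beyond tx_prod leads to a NULL skb, here is a simplified user-space model of a TX completion walk. It is a sketch under stated assumptions, not the bnx2 code: the model_* names, the 256-entry ring and the index mask are all hypothetical, and the real driver also handles fragments and the chain descriptor at the end of each ring page.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 256u                     /* assumed ring size for this model */
#define RING_IDX(x) ((x) & (RING_SIZE - 1u))

struct model_skb { unsigned int len; };         /* stand-in for struct sk_buff */
struct model_tx_buf { struct model_skb *skb; }; /* stand-in for one TX slot */

/*
 * Walk the ring from sw_cons towards hw_cons as a completion handler
 * would, releasing each slot. Unlike a handler that trusts hw_cons,
 * stop at the first slot that has no skb attached instead of
 * dereferencing it. Returns the number of slots completed and sets
 * *stopped_early to 1 if an empty slot was reached before hw_cons.
 */
static unsigned int model_tx_complete(struct model_tx_buf *ring,
                                      uint16_t sw_cons, uint16_t hw_cons,
                                      int *stopped_early)
{
    unsigned int done = 0;

    *stopped_early = 0;
    while (sw_cons != hw_cons) {
        struct model_tx_buf *buf = &ring[RING_IDX(sw_cons)];

        if (buf->skb == NULL) {
            /* Slot was never filled by the transmit path. */
            *stopped_early = 1;
            break;
        }
        buf->skb = NULL;    /* model of releasing the buffer */
        sw_cons++;
        done++;
    }
    return done;
}
```

If slots 27603 through 27707 are filled and the walk is asked to run up to 27722, the model completes 105 slots and then stops at the slot for index 27708, which was never filled. A walk without the NULL check would dereference that empty slot, which is consistent with R13 being 0 in the oops.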
> I referred to https://access.redhat.com/solutions/341183
> and
> http://kernel.opensuse.org/cgit/kernel/commit/?id=c1f5163de417dab01fa9daaf09a74bbb19303f3c
> but cannot tell exactly which case our bug hits.
>
> Thanks,
> James Li
>