Date:	Tue, 11 Jan 2011 09:10:55 -0500
From:	Stephen Clark <sclark46@...thlink.net>
To:	Matt Carlson <mcarlson@...adcom.com>
CC:	Linux Kernel Network Developers <netdev@...r.kernel.org>,
	Michael Chan <mchan@...adcom.com>
Subject: Re: panic in tg3 driver

On 01/10/2011 09:00 PM, Matt Carlson wrote:
> On Mon, Jan 10, 2011 at 12:04:34PM -0800, Stephen Clark wrote:
>    
>> On 01/10/2011 02:22 PM, Matt Carlson wrote:
>>      
>>> On Sun, Jan 09, 2011 at 02:30:50PM -0800, Stephen Clark wrote:
>>>
>>>        
>>>> On 01/04/2011 09:54 AM, Stephen Clark wrote:
>>>>
>>>>          
>>>>> Hello,
>>>>>
>>>>>
>>>>> The hardware is an Acrosser AR-M0898B micro box.
>>>>>    lspci
>>>>> 00:00.0 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
>>>>> 00:00.1 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
>>>>> 00:00.2 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
>>>>> 00:00.3 Host bridge: VIA Technologies, Inc. PT890 Host Bridge
>>>>> 00:00.4 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
>>>>> 00:00.7 Host bridge: VIA Technologies, Inc. CN700/VN800/P4M800CE/Pro Host Bridge
>>>>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237/VX700 PCI Bridge
>>>>> 00:0f.0 IDE interface: VIA Technologies, Inc. VT8251 Serial ATA Controller (rev 20)
>>>>> 00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 07)
>>>>> 00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
>>>>> 00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
>>>>> 00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
>>>>> 00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 91)
>>>>> 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 90)
>>>>> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8251 PCI to ISA Bridge
>>>>> 00:11.7 Host bridge: VIA Technologies, Inc. VT8251 Ultra VLINK Controller
>>>>> 00:13.0 Host bridge: VIA Technologies, Inc. VT8251 Host Bridge
>>>>> 00:13.1 PCI bridge: VIA Technologies, Inc. VT8251 PCI to PCI Bridge
>>>>> 02:08.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 02)
>>>>> 02:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 02)
>>>>> 80:00.0 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
>>>>> 80:00.1 PCI bridge: VIA Technologies, Inc. VT8251 PCIE Root Port
>>>>> 81:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02)
>>>>> 82:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02)
>>>>>
>>>>> Kernel 2.6.36-2.el5.elrepo on an i686
>>>>>
>>>>> When I try to ifconfig either of the BCM5906M ports the system panics.
>>>>>
>>>>> Ideas, fixes ?
>>>>>
>>>>> [root@...10 ~]# modprobe tg3
>>>>> [root@...10 ~]# ifconfig eth2 2.2.2.2/24
>>>>> ------------[ cut here ]------------
>>>>> kernel BUG at drivers/net/tg3.c:4365!
>>>>> invalid opcode: 0000 [#1] PREEMPT SMP
>>>>> last sysfs file: /sys/class/net/eth3/address
>>>>> Modules linked in: tg3 xt_tcpudp ipt_LOG xt_limit xt_state
>>>>> iptable_mangle af_ke]
>>>>>
>>>>> Pid: 20303, comm: kworker/0:2 Not tainted 2.6.36-2.el5.elrepo #1 CN700-8251/
>>>>> EIP: 0060:[<e1c62f19>] EFLAGS: 00010202 CPU: 0
>>>>> EIP is at tg3_tx_recover+0x1e/0x53 [tg3]
>>>>> EAX: deece4c0 EBX: dfa9c000 ECX: deece4c0 EDX: ffffffff
>>>>> ESI: deece4c0 EDI: deece500 EBP: c1801f38 ESP: c1801f30
>>>>>    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>>>>> Process kworker/0:2 (pid: 20303, ti=c1801000 task=df0105d0 task.ti=dee62000)
>>>>> Stack:
>>>>>    dfa9c000 00000000 c1801f6c e1c630be c1801f6c deece4c0 00000840 00000000
>>>>> <0>   df251cc0 00000005 00000000 df979800 deece500 deece4c0 00000040 c1801f94
>>>>> <0>   e1c661e5 00000000 00000040 c1801f88 e09df1d2 00000000 deece500 dfab4000
>>>>> Call Trace:
>>>>>    [<e1c630be>] ? tg3_tx+0x157/0x1a2 [tg3]
>>>>>    [<e1c661e5>] ? tg3_poll_work+0x2b/0x10b [tg3]
>>>>>    [<e09df1d2>] ? ssb_write32+0x11/0x14 [b44]
>>>>>    [<e1c662f9>] ? tg3_poll+0x34/0x9a [tg3]
>>>>>    [<c0674058>] ? net_rx_action+0x7e/0x11c
>>>>>    [<c04409c9>] ? __do_softirq+0x85/0x10c
>>>>>    [<c0440944>] ? __do_softirq+0x0/0x10c
>>>>> <IRQ>
>>>>>    [<c04404ef>] ? _local_bh_enable_ip+0x68/0x87
>>>>>    [<c044051b>] ? local_bh_enable_ip+0xd/0xf
>>>>>    [<c046593b>] ? __raw_spin_unlock_bh+0x1c/0x1e
>>>>>    [<c06fa4f2>] ? _raw_spin_unlock_bh+0xd/0xf
>>>>>    [<e1c6281f>] ? spin_unlock_bh+0xd/0xf [tg3]
>>>>>    [<e1c62cbe>] ? tg3_full_unlock+0x10/0x12 [tg3]
>>>>>    [<e1c664c7>] ? tg3_reset_task+0xd7/0xe3 [tg3]
>>>>>    [<c044ec37>] ? process_one_work+0x10b/0x1bc
>>>>>    [<e1c663f0>] ? tg3_reset_task+0x0/0xe3 [tg3]
>>>>>    [<c044fd41>] ? worker_thread+0x77/0xf9
>>>>>    [<c0453048>] ? kthread+0x60/0x65
>>>>>    [<c044fcca>] ? worker_thread+0x0/0xf9
>>>>>    [<c0452fe8>] ? kthread+0x0/0x65
>>>>>    [<c040337e>] ? kernel_thread_helper+0x6/0x10
>>>>> Code: f0 e8 88 ff ff ff 8d 65 f8 5b 5e 5d c3 55 89 e5 56 53 0f 1f 44
>>>>> 00 00 f6 8
>>>>> EIP: [<e1c62f19>] tg3_tx_recover+0x1e/0x53 [tg3] SS:ESP 0068:c1801f30
>>>>> ---[ end trace 82381e9b93e397ad ]---
>>>>> Kernel panic - not syncing: Fatal exception in interrupt
>>>>> Pid: 20303, comm: kworker/0:2 Tainted: G      D 2.6.36-2.el5.elrepo #1
>>>>> Call Trace:
>>>>>    [<c043b3cd>] panic+0x62/0x15d
>>>>>    [<c06fb7d1>] oops_end+0x99/0xa8
>>>>>    [<e1c62f19>] ? tg3_tx_recover+0x1e/0x53 [tg3]
>>>>>    [<c0405a62>] die+0x58/0x5e
>>>>>
>>>>> Thanks,
>>>>> Steve
>>>>>
>>>>>
>>>>>            
>>>> Additional info: I compiled the latest kernel 2.6.37-rc8+ and still have the problem.
>>>> Also, when I boot with noapic I see this in the dmesg log, and interrupts are
>>>> increasing like crazy:
>>>> tg3.c:v3.115 (October 14, 2010)
>>>> tg3 0000:81:00.0: PCI INT A ->   Link[LNKA] ->   GSI 10 (level, low) ->   IRQ 10
>>>> tg3 0000:81:00.0: setting latency timer to 64
>>>> tg3 0000:81:00.0: PCI: Disallowing DAC for device
>>>> tg3 0000:81:00.0: eth2: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:39
>>>> tg3 0000:81:00.0: eth2: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
>>>> tg3 0000:81:00.0: eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
>>>> tg3 0000:81:00.0: eth2: dma_rwctrl[76180000] dma_mask[32-bit]
>>>> tg3 0000:82:00.0: PCI INT A ->   Link[LNKA] ->   GSI 10 (level, low) ->   IRQ 10
>>>> tg3 0000:82:00.0: setting latency timer to 64
>>>> tg3 0000:82:00.0: PCI: Disallowing DAC for device
>>>> tg3 0000:82:00.0: eth3: Tigon3 [partno(BCM95906) rev c002] (PCI Express) MAC address 00:02:b6:36:d1:3a
>>>> tg3 0000:82:00.0: eth3: attached PHY is 5906 (10/100Base-TX Ethernet) (WireSpeed[0])
>>>> tg3 0000:82:00.0: eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
>>>> tg3 0000:82:00.0: eth3: dma_rwctrl[76180000] dma_mask[32-bit]
>>>> tg3 0000:81:00.0: irq 40 for MSI/MSI-X
>>>> tg3 0000:81:00.0: eth2: No interrupt was generated using MSI. Switching to INTx mode. Please report this failure to the PCI maintainer and include system chipset information
>>>> ADDRCONF(NETDEV_UP): eth2: link is not ready
>>>> [root@...10 ~]# cat /proc/interrupts
>>>>               CPU0
>>>>      0:        162    XT-PIC-XT-PIC    timer
>>>>      1:          2    XT-PIC-XT-PIC    i8042
>>>>      2:          0    XT-PIC-XT-PIC    cascade
>>>>      3:          1    XT-PIC-XT-PIC
>>>>      4:       4863    XT-PIC-XT-PIC    serial
>>>>      6:          2    XT-PIC-XT-PIC    floppy
>>>>      7:          5    XT-PIC-XT-PIC    ehci_hcd:usb1, uhci_hcd:usb3
>>>>      8:          0    XT-PIC-XT-PIC    rtc0
>>>>      9:          0    XT-PIC-XT-PIC    acpi
>>>>     10:    2334234    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
>>>>
>>>> [root@...10 ~]# cat /proc/interrupts |grep eth2
>>>>     10:   18388914    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
>>>> [root@...10 ~]# cat /proc/interrupts |grep eth2
>>>>     10:   18901627    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2
>>>>
>>>> -- 
>>>>
>>>> "They that give up essential liberty to obtain temporary safety,
>>>> deserve neither liberty nor safety."  (Ben Franklin)
>>>>
>>>> "The course of history shows that as a government grows, liberty
>>>> decreases."  (Thomas Jefferson)
>>>>
>>>>          
>>> I think drivers/net/tg3.c:4365 is at the line that reads
>>> "spin_lock(&tp->lock);" in tg3_tx_recover.  Can you verify?
>>>
>>>
>>>        
>>
>>           tg3_readphy(tp, MII_TG3_DSP_RW_PORT,&phy2);
>>
>> in static void tg3_serdes_parallel_detect(struct tg3 *tp)
>>
>> The driver version is:
>> #define DRV_MODULE_NAME        "tg3"
>> #define TG3_MAJ_NUM            3
>> #define TG3_MIN_NUM            115
>>      
>
> That doesn't look right.  The line number I quoted came from the kernel
> panic output from 2.6.36-2.el5.elrepo.  I'm guessing you quoted me the
> sources from the tg3.c file in 2.6.37-rc8+.  If you don't have the
> 2.6.36-2.el5.elrepo sources readily available, can you give me the line
> the kernel panic specifies from the tg3.c file from your 2.6.37-rc8+
> sources?
>
>    
Oops, you are correct. The problem is that most of the time I don't get a
panic on the screen; the box simply reboots.

I'll see if I can get the 2.6.36-2 sources, though they are supposed to be
the virgin kernel.org sources simply recompiled for CentOS.

static void tg3_tx_recover(struct tg3 *tp)
{
     BUG_ON((tp->tg3_flags & TG3_FLAG_MBOX_WRITE_REORDER) ||
4365:           tp->write32_tx_mbox == tg3_write_indirect_mbox);
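
For anyone following along, here is a minimal user-space sketch of what the BUG_ON() condition quoted above tests. It is not driver code: the struct, the flag value, and every demo_* name are hypothetical stand-ins for the real tg3 definitions, and the two mailbox writers are dummies whose only purpose is to have distinct addresses. All it shows is that the check trips when either the reorder flag is already set or the TX mailbox writer is the indirect one.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space model only. The names and the flag value below are
 * hypothetical stand-ins; the real definitions live in the tg3 driver.
 */
#define DEMO_FLAG_MBOX_WRITE_REORDER 0x1u

struct demo_tg3;
typedef void (*demo_mbox_write_fn)(struct demo_tg3 *tp, uint32_t off,
                                   uint32_t val);

struct demo_tg3 {
    uint32_t flags;                     /* models tp->tg3_flags */
    demo_mbox_write_fn write32_tx_mbox; /* models tp->write32_tx_mbox */
    uint32_t last_direct;               /* scratch for the dummy writers */
    uint32_t last_indirect;
};

/* Dummy "direct" mailbox writer; it only records the value. */
static void demo_write_direct_mbox(struct demo_tg3 *tp, uint32_t off,
                                   uint32_t val)
{
    (void)off;
    tp->last_direct = val;
}

/* Dummy "indirect" mailbox writer; it only records the value. */
static void demo_write_indirect_mbox(struct demo_tg3 *tp, uint32_t off,
                                     uint32_t val)
{
    (void)off;
    tp->last_indirect = val;
}

/* Returns true when the condition inside the quoted BUG_ON() holds. */
static bool demo_tx_recover_would_bug(const struct demo_tg3 *tp)
{
    return (tp->flags & DEMO_FLAG_MBOX_WRITE_REORDER) != 0 ||
           tp->write32_tx_mbox == demo_write_indirect_mbox;
}
```

In this model, a device with the flag clear and the direct writer installed passes the check; setting the flag or switching the writer to demo_write_indirect_mbox makes demo_tx_recover_would_bug() return true.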


> It looks like there are a lot of devices on IRQ 10.  Does the interrupt
> count drop if you bring down eth0 (which I'm guessing is the b44 device)?
>    
This happens when I boot with noapic, which I only did as a test. With the
noapic option the system doesn't panic, but it gets all these extra
interrupts as soon as I ifconfig one of the 5906 ports.
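
To put a number on "extra interrupts", something like the following sketch could be used. It assumes the single-CPU /proc/interrupts layout shown in the quoted output (IRQ number, a colon, then one count column). The demo_* helper names are made up for illustration, and the sampling interval has to be supplied by the caller, since the two readings quoted above were not timed.

```c
#include <stdio.h>

/*
 * Parse one /proc/interrupts line of the single-CPU form, e.g.
 * "    10:   18388914    XT-PIC-XT-PIC    uhci_hcd:usb2, eth0, eth2".
 * Returns 1 and fills *irq and *count on success, 0 otherwise.
 * Only the first CPU column is read; named rows such as "NMI:" fail.
 */
static int demo_parse_irq_line(const char *line, unsigned int *irq,
                               unsigned long long *count)
{
    return sscanf(line, " %u: %llu", irq, count) == 2;
}

/*
 * Interrupts per second between two samples taken `seconds` apart.
 * Returns -1.0 if the interval is not positive or the counter went
 * backwards (for example after a reboot).
 */
static double demo_irq_rate(unsigned long long before,
                            unsigned long long after, double seconds)
{
    if (seconds <= 0.0 || after < before)
        return -1.0;
    return (double)(after - before) / seconds;
}
```

Fed the two IRQ 10 readings quoted above (18388914 and 18901627), the difference is 512713 interrupts between the two cat invocations; dividing by the real time between samples would give the rate.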


> Can you tell me if you saw the following message in the syslogs?
>
> "The system may be re-ordering memory-mapped I/O cycles to the network
>   device, attempting to recover.  Please report the problem to the driver
>   maintainer and include system chipset information."
>
>    
Couldn't find this in the messages file.



