Message-id: <4B59E7EB.3050605@majjas.com>
Date: Fri, 22 Jan 2010 13:01:15 -0500
From: Michael Breuer <mbreuer@...jas.com>
To: Jarek Poplawski <jarkao2@...il.com>
Cc: David Miller <davem@...emloft.net>,
Stephen Hemminger <shemminger@...ux-foundation.org>,
akpm@...ux-foundation.org, flyboy@...il.com,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
Michael Chan <mchan@...adcom.com>,
Don Fry <pcnet32@...izon.net>,
Francois Romieu <romieu@...zoreil.com>,
Matt Carlson <mcarlson@...adcom.com>
Subject: Hang: 2.6.32.4 sky2/DMAR (was [PATCH] sky2: Fix WARNING: at
lib/dma-debug.c:902 check_sync)
Kernel 2.6.32.4 (git) with the following patches applied:
af_packet.c (tpacket_snd version 3)
sky2.c pskb_may_pull
sky2 fix WARNING at lib/dma-debug.c check_sync
With CONFIG_DMAR=n, the system is stable.
With the exact same source but CONFIG_DMAR=y, I get the WARNING (see
below) after about 36 hours of uptime (it has varied from about 24 to
about 48 hours).
Smolt profile:
http://smolt.fedoraproject.org/show?uuid=pub_bb05c701-1e47-4b3c-9fab-54f520f39d79+
I'm also attaching dmesg.old (dmesg from the crash).
After the events logged below, the system hangs and the system watchdog
reboots it.
Of interest: every time this has happened, the system was under heavy RX
load (a Win7 backup to a CIFS share hosted on this server). Also, a DHCP
exchange of some sort always precedes the event.
It may be possible to reproduce the event without DMAR enabled, but I
have not managed to do so.
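One way to check the DHCP correlation across a longer syslog file is sketched here. It is only an illustration, not something taken from the original report: it assumes the "Mon DD HH:MM:SS host tag: message" layout seen in the log excerpt, treats any kernel line starting with "DMAR:[DMA" as a fault, and takes the year as a parameter because syslog does not record it. The function names `parse` and `dhcp_before_dmar` are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout: "Mon DD HH:MM:SS host tag[pid]: message".
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ ([^:\[\s]+)(?:\[\d+\])?: (.*)$"
)


def parse(line, year=2010):
    """Return (timestamp, tag, message), or None for wrapped continuation lines."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Syslog omits the year, so the caller supplies it (an assumption).
    stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
    return stamp, m.group(2), m.group(3)


def dhcp_before_dmar(lines, year=2010):
    """For each DMAR fault line, report seconds since the last dhcpd message."""
    last_dhcp = None
    results = []
    for line in lines:
        rec = parse(line, year)
        if rec is None:
            continue  # continuation of a wrapped log line
        stamp, tag, msg = rec
        if tag == "dhcpd":
            last_dhcp = stamp
        elif tag == "kernel" and msg.startswith("DMAR:[DMA"):
            gap = None if last_dhcp is None else (stamp - last_dhcp).total_seconds()
            results.append((stamp, gap))
    return results
```

Passing an open log file (or any iterable of lines) to `dhcp_before_dmar` returns one `(fault_time, seconds_since_dhcp)` pair per fault; a gap of `None` means no dhcpd message was seen before that fault.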
Jan 22 05:38:36 mail dhcpd: DHCPREQUEST for 10.0.0.54 from
00:1b:78:c8:2b:8e (HPC82B8D) via eth0
Jan 22 05:38:36 mail dhcpd: DHCPACK on 10.0.0.54 to 00:1b:78:c8:2b:8e
(HPC82B8D) via eth0
Jan 22 05:38:41 mail kernel: DRHD: handling fault status reg 2
Jan 22 05:38:41 mail kernel: DMAR:[DMA Read] Request device [06:00.0]
fault addr ffdfdd9fe000
Jan 22 05:38:41 mail kernel: DMAR:[fault reason 06] PTE Read access is
not set
Jan 22 05:38:41 mail kernel: sky2 0000:06:00.0: error interrupt
status=0xc0000000
Jan 22 05:38:41 mail kernel: sky2 0000:06:00.0: PCI hardware error (0x2010)
Jan 22 05:39:18 mail kernel: ------------[ cut here ]------------
Jan 22 05:39:18 mail kernel: WARNING: at net/sched/sch_generic.c:261
dev_watchdog+0xf3/0x164()
Jan 22 05:39:18 mail kernel: Hardware name: System Product Name
Jan 22 05:39:18 mail kernel: NETDEV WATCHDOG: eth0 (sky2): transmit
queue 0 timed out
Jan 22 05:39:18 mail kernel: Modules linked in: cpufreq_stats
ip6table_mangle ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat
nf_nat iptable_mangle iptable_raw appletalk psnap llc nfsd lockd nfs_acl
auth_rpcgss exportfs hwmon_vid coretemp sunrpc acpi_cpufreq sit tunnel4
ipt_LOG nf_conntrack_netbios_ns nf_conntrack_ftp nf_conntrack_ipv6
xt_multiport xt_DSCP xt_dscp xt_MARK ipv6 dm_multipath kvm_intel kvm
snd_hda_codec_analog snd_ens1371 gameport snd_rawmidi snd_hda_intel
snd_ac97_codec snd_hda_codec snd_hwdep ac97_bus snd_seq snd_seq_device
firewire_ohci snd_pcm firewire_core crc_itu_t snd_timer snd
gspca_spca505 gspca_main i2c_i801 videodev v4l1_compat sky2 soundcore
v4l2_compat_ioctl32 wmi snd_page_alloc asus_atk0110 hwmon pcspkr
iTCO_wdt iTCO_vendor_support fbcon tileblit font bitblit softcursor
raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy
async_tx raid1 ata_generic pata_acpi pata_marvell nouveau ttm
drm_kms_helper drm agpgart fb i2c_algo_bit cfbcopyarea i2c_core
cfbimgblt cfb
Jan 22 05:39:18 mail kernel: fillrect [last unloaded: microcode]
Jan 22 05:39:18 mail kernel: Pid: 0, comm: swapper Tainted: G W
2.6.32.4MMAPDMARAF3SKY2PSKBMAYPULL-00912-g914160d-dirty #6
Jan 22 05:39:18 mail kernel: Call Trace:
Jan 22 05:39:18 mail kernel: <IRQ> [<ffffffff810536ee>]
warn_slowpath_common+0x7c/0x94
Jan 22 05:39:18 mail kernel: [<ffffffff8105375d>]
warn_slowpath_fmt+0x41/0x43
Jan 22 05:39:18 mail kernel: [<ffffffff813e3b6b>] ? netif_tx_lock+0x44/0x6c
Jan 22 05:39:18 mail kernel: [<ffffffff813e3cd3>] dev_watchdog+0xf3/0x164
Jan 22 05:39:18 mail kernel: [<ffffffff8105f3f4>] ? cascade+0x6a/0x84
Jan 22 05:39:18 mail kernel: [<ffffffff8106323f>]
run_timer_softirq+0x1c8/0x270
Jan 22 05:39:18 mail kernel: [<ffffffff8105af0f>] __do_softirq+0xf8/0x1cd
Jan 22 05:39:18 mail kernel: [<ffffffff8107f0ab>] ?
tick_program_event+0x2a/0x2c
Jan 22 05:39:18 mail kernel: [<ffffffff81012e1c>] call_softirq+0x1c/0x30
Jan 22 05:39:18 mail kernel: [<ffffffff810143a3>] do_softirq+0x4b/0xa6
Jan 22 05:39:18 mail kernel: [<ffffffff8105aaef>] irq_exit+0x4a/0x8c
Jan 22 05:39:18 mail kernel: [<ffffffff81470612>]
smp_apic_timer_interrupt+0x86/0x94
Jan 22 05:39:18 mail kernel: [<ffffffff810127e3>]
apic_timer_interrupt+0x13/0x20
Jan 22 05:39:18 mail kernel: <EOI> [<ffffffff812c729a>] ?
acpi_idle_enter_bm+0x256/0x28a
Jan 22 05:39:18 mail kernel: [<ffffffff812c7293>] ?
acpi_idle_enter_bm+0x24f/0x28a
Jan 22 05:39:18 mail kernel: [<ffffffff813a6c3c>] ?
cpuidle_idle_call+0x9e/0xfa
Jan 22 05:39:18 mail kernel: [<ffffffff81010c90>] ? cpu_idle+0xb4/0xf6
Jan 22 05:39:18 mail kernel: [<ffffffff81465ba5>] ?
start_secondary+0x201/0x242
Jan 22 05:39:18 mail kernel: ---[ end trace 57f7151f6a5def07 ]---
Jan 22 05:39:18 mail kernel: sky2 eth0: tx timeout
Jan 22 05:39:18 mail kernel: sky2 eth0: transmit ring 76 .. 35 report=76
done=76
Jan 22 05:39:18 mail kernel: sky2 eth0: disabling interface
Jan 22 05:39:18 mail kernel: sky2 eth0: enabling interface
Jan 22 05:39:21 mail kernel: sky2 eth0: Link is up at 1000 Mbps, full
duplex, flow control both
Jan 22 05:40:06 mail kernel: sky2 eth0: tx timeout
Jan 22 05:40:06 mail kernel: sky2 eth0: transmit ring 3 .. 90 report=3
done=3
Jan 22 05:40:06 mail kernel: sky2 eth0: disabling interface
Jan 22 05:40:06 mail kernel: sky2 eth0: enabling interface
Jan 22 05:40:09 mail kernel: sky2 eth0: Link is up at 1000 Mbps, full
duplex, flow control both
Jan 22 05:40:30 mail abrt: Kerneloops: Reported 1 kernel oopses to Abrt
Jan 22 05:40:30 mail abrtd: Directory 'kerneloops-1264156830-1' creation
detected
Jan 22 05:40:30 mail abrtd: Can't open file
'/var/cache/abrt/kerneloops-1264156830-1/cmdline'
Jan 22 05:40:30 mail abrtd: Corrupted or bad crash, deleting
Jan 22 05:40:33 mail dhcpd: DHCPINFORM from 10.0.0.11 via eth0
Jan 22 05:40:33 mail dhcpd: DHCPACK to 10.0.0.11 (00:1a:92:8d:30:81) via
eth0
Jan 22 05:40:36 mail dhcpd: DHCPINFORM from 10.0.0.11 via eth0
Jan 22 05:40:36 mail dhcpd: DHCPACK to 10.0.0.11 (00:1a:92:8d:30:81) via
eth0
Jan 22 05:40:50 mail named[11245]: error (connection refused) resolving
'173-212-207-86.hostnoc.net/A/IN': 2607:f878:0:3::12#53
Jan 22 05:40:50 mail named[11245]: error (connection refused) resolving
'NS1.HOSTNOC.NET/AAAA/IN': 2607:f878:0:3::10#53
<watchdog reboot>
View attachment "dmesg.old" of type "text/plain" (72486 bytes)