Message-ID: <005401d0585c$6c7707c0$45651740$@lucidpixels.com>
Date: Fri, 6 Mar 2015 17:25:17 -0500
From: "Justin Piszcz" <jpiszcz@...idpixels.com>
To: <linux-kernel@...r.kernel.org>
Subject: RE: 3.19 kernel: BUG: unable to handle kernel NULL pointer dereference [SOLVED (use 3.15 for now)]
> -----Original Message-----
> From: Justin Piszcz [mailto:jpiszcz@...idpixels.com]
> Sent: Saturday, February 28, 2015 5:57 PM
> To: linux-kernel@...r.kernel.org
> Subject: RE: 3.19 kernel: BUG: unable to handle kernel NULL pointer
> dereference
>
Removing the card did not fix the issue.
I've gone back to 3.15 in the meantime and was able to copy 20TB of test
data over NFS in a little over 10 hours without any crashes, using the
following .config:
https://home.comcast.net/~jpiszcz/20150306/3.15-working.txt
Proof:
p34:/r1# /usr/bin/time cp -r /nfs/atom/r1 .
35.79user 19018.69system 10:07:54elapsed 52%CPU (0avgtext+0avgdata 4484maxresident)k
0inputs+0outputs (0major+1443minor)pagefaults 0swaps
p34:/r1#
$ uname -a
Linux box 3.15.0 #1 SMP Sat Jul 12 09:54:17 EDT 2014 x86_64 GNU/Linux
$ zcat /proc/config.gz > ~/3.15-working.txt
I have not had time to run a git bisect, but hopefully this helps someone
with the DMAR/PTE issue. 3.19 crashes consistently with any of these
configs:
http://home.comcast.net/~jpiszcz/20150306/config-3.19.0-1.txt
http://home.comcast.net/~jpiszcz/20150306/config-3.19.0-2.txt
http://home.comcast.net/~jpiszcz/20150306/config-3.19.0-3.txt
http://home.comcast.net/~jpiszcz/20150306/config-3.19.0-4.txt
$ diff -u 3.15-working.txt config-3.19.0-4.txt | grep -i DMAR
$ grep -i DMAR 3.15-working.txt config-3.19.0-4.txt
3.15-working.txt:CONFIG_DMAR_TABLE=y
config-3.19.0-4.txt:CONFIG_DMAR_TABLE=y
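Grepping for DMAR only turns up CONFIG_DMAR_TABLE, which is set the same way in both files, so a wider comparison of the IOMMU-related options may be more telling. A minimal sketch, assuming POSIX sh with mktemp available; the function name config_diff and the default pattern 'IOMMU|DMAR' are illustrative choices, not something taken from the configs above:

```shell
# config_diff OLD NEW [PATTERN]
# Print the kernel config options matching PATTERN (an extended regex,
# matched case-insensitively) whose settings differ between OLD and NEW.
# Returns 0 if the matching options are identical, 1 if they differ.
config_diff() {
    cd_pattern=${3:-'IOMMU|DMAR'}
    cd_old=$(mktemp) || return 2
    cd_new=$(mktemp) || { rm -f "$cd_old"; return 2; }

    # Keep both "CONFIG_X=..." and "# CONFIG_X is not set" lines so that an
    # option being switched off shows up as a difference too.
    grep -E '^(# )?CONFIG_' "$1" | grep -iE "$cd_pattern" | sort > "$cd_old"
    grep -E '^(# )?CONFIG_' "$2" | grep -iE "$cd_pattern" | sort > "$cd_new"

    cd_status=0
    diff "$cd_old" "$cd_new" || cd_status=$?
    rm -f "$cd_old" "$cd_new"
    return "$cd_status"
}
```

For example, `config_diff 3.15-working.txt config-3.19.0-4.txt` would list any IOMMU or DMAR option that is set differently, with `<` marking the 3.15 side and `>` the 3.19 side; no output means the matching options are the same.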
$ cp dmesg ~/dmesg-3.15.txt
$ cat dmesg.0 > ~/dmesg-3.19.txt
DMESG:
http://home.comcast.net/~jpiszcz/20150306/dmesg-3.15.txt (works 100%!)
http://home.comcast.net/~jpiszcz/20150306/dmesg-3.19.txt (crashes consistently when copying files over NFS)
With the 3.15 kernel:
[ 0.058061] Freeing SMP alternatives memory: 28K (ffffffff81d92000 - ffffffff81d99000)
[ 0.058208] dmar: Host address width 40
[ 0.058272] dmar: DRHD base: 0x000000f8dfe000 flags: 0x0
[ 0.058350] dmar: IOMMU 0: reg_base_addr f8dfe000 ver 1:0 cap c90780106f0462 ecap f020f6
[ 0.058444] dmar: DRHD base: 0x000000fecfe000 flags: 0x1
[ 0.058515] dmar: IOMMU 1: reg_base_addr fecfe000 ver 1:0 cap c90780106f0462 ecap f020f6
[ 0.058599] dmar: RMRR base: 0x000000000ec000 end: 0x000000000effff
[ 0.058667] dmar: RMRR base: 0x000000bf7ec000 end: 0x000000bf7fffff
[ 0.058734] dmar: ATSR flags: 0x0
[ 0.058796] dmar: ATSR flags: 0x0
[ 0.058859] dmar: RHSA base: 0x000000fecfe000 proximity domain: 0x0
[ 0.058927] dmar: RHSA base: 0x000000f8dfe000 proximity domain: 0x1
[ 0.059266] Switched APIC routing to physical flat.
With the 3.19 kernel:
[ 0.055785] Freeing SMP alternatives memory: 32K (ffffffff81faf000 - ffffffff81fb7000)
[ 0.055939] dmar: Host address width 40
[ 0.056003] dmar: DRHD base: 0x000000f8dfe000 flags: 0x0
[ 0.056080] dmar: IOMMU 0: reg_base_addr f8dfe000 ver 1:0 cap c90780106f0462 ecap f020f6
[ 0.056164] dmar: DRHD base: 0x000000fecfe000 flags: 0x1
[ 0.056237] dmar: IOMMU 1: reg_base_addr fecfe000 ver 1:0 cap c90780106f0462 ecap f020f6
[ 0.056321] dmar: RMRR base: 0x000000000ec000 end: 0x000000000effff
[ 0.056389] dmar: RMRR base: 0x000000bf7ec000 end: 0x000000bf7fffff
[ 0.056466] dmar: ATSR flags: 0x0
[ 0.056528] dmar: ATSR flags: 0x0
[ 0.056591] dmar: RHSA base: 0x000000fecfe000 proximity domain: 0x0
[ 0.056658] dmar: RHSA base: 0x000000f8dfe000 proximity domain: 0x1
[ 0.056997] Switched APIC routing to physical flat.
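The DMAR lines quoted above are the same under both kernels apart from the timestamps. One way to check that against the full logs is to strip the timestamps before comparing. A rough sketch; dmar_lines is a hypothetical helper name, and it assumes every dmesg line starts with a "[ seconds.microseconds]" prefix as in the excerpts:

```shell
# dmar_lines FILE
# Print the DMAR lines from a saved dmesg log with the leading
# "[ seconds.microseconds]" timestamp removed, so that logs from
# two different boots can be compared directly.
dmar_lines() {
    sed 's/^\[ *[0-9][0-9]*\.[0-9]*] *//' "$1" | grep -i '^dmar:'
}
```

Something like `dmar_lines dmesg-3.15.txt > a; dmar_lines dmesg-3.19.txt > b; diff a b` should then print nothing if the two kernels see the same DMAR tables.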
Justin.