Message-ID: <5019ACC6.1040501@cora.nwra.com>
Date:	Wed, 01 Aug 2012 16:25:10 -0600
From:	Orion Poplawski <orion@...a.nwra.com>
To:	linux-kernel@...r.kernel.org
Subject: Need help debugging crazy kernel memory issue

I recently started experiencing crashes (every 1-2 days) on one of my 
Scientific Linux 6.2 boxes.  It appears that the machine runs out of memory, 
but the memory report makes no sense.

I'm seeing it with each of the following kernels:

kernel-2.6.32-220.17.1.el6.x86_64
kernel-2.6.32-220.23.1.el6.x86_64
kernel-2.6.32-279.1.1.el6.x86_64

kernel-2.6.32-220.17.1.el6.x86_64 had run fine for a long time, but reverting 
to it has not solved the problem.  I have also tried reverting all other 
updates from around the time the problems started, to no avail.

It seems like everything about the memory config goes screwy.  Here's an info 
dump from normal operation (48GB RAM, 8GB swap):

SysRq : Show Memory
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
CPU    4: hi:    0, btch:   1 usd:   0
CPU    5: hi:    0, btch:   1 usd:   0
CPU    6: hi:    0, btch:   1 usd:   0
CPU    7: hi:    0, btch:   1 usd:   0
CPU    8: hi:    0, btch:   1 usd:   0
CPU    9: hi:    0, btch:   1 usd:   0
CPU   10: hi:    0, btch:   1 usd:   0
CPU   11: hi:    0, btch:   1 usd:   0
CPU   12: hi:    0, btch:   1 usd:   0
CPU   13: hi:    0, btch:   1 usd:   0
CPU   14: hi:    0, btch:   1 usd:   0
CPU   15: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd: 180
CPU    1: hi:  186, btch:  31 usd: 158
CPU    2: hi:  186, btch:  31 usd:   0
CPU    3: hi:  186, btch:  31 usd:   0
CPU    4: hi:  186, btch:  31 usd:   0
CPU    5: hi:  186, btch:  31 usd:   0
CPU    6: hi:  186, btch:  31 usd:   0
CPU    7: hi:  186, btch:  31 usd:   0
CPU    8: hi:  186, btch:  31 usd:   0
CPU    9: hi:  186, btch:  31 usd:  19
CPU   10: hi:  186, btch:  31 usd:  30
CPU   11: hi:  186, btch:  31 usd:   0
CPU   12: hi:  186, btch:  31 usd:   0
CPU   13: hi:  186, btch:  31 usd:   0
CPU   14: hi:  186, btch:  31 usd:   0
CPU   15: hi:  186, btch:  31 usd:   0
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 144
CPU    1: hi:  186, btch:  31 usd: 124
CPU    2: hi:  186, btch:  31 usd: 182
CPU    3: hi:  186, btch:  31 usd: 162
CPU    4: hi:  186, btch:  31 usd: 182
CPU    5: hi:  186, btch:  31 usd: 162
CPU    6: hi:  186, btch:  31 usd:   0
CPU    7: hi:  186, btch:  31 usd:   0
CPU    8: hi:  186, btch:  31 usd: 138
CPU    9: hi:  186, btch:  31 usd: 119
CPU   10: hi:  186, btch:  31 usd:  88
CPU   11: hi:  186, btch:  31 usd:  75
CPU   12: hi:  186, btch:  31 usd:   0
CPU   13: hi:  186, btch:  31 usd: 183
CPU   14: hi:  186, btch:  31 usd: 179
CPU   15: hi:  186, btch:  31 usd: 159
Node 1 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 184
CPU    1: hi:  186, btch:  31 usd: 168
CPU    2: hi:  186, btch:  31 usd: 164
CPU    3: hi:  186, btch:  31 usd:   0
CPU    4: hi:  186, btch:  31 usd:  41
CPU    5: hi:  186, btch:  31 usd: 172
CPU    6: hi:  186, btch:  31 usd: 126
CPU    7: hi:  186, btch:  31 usd: 145
CPU    8: hi:  186, btch:  31 usd:  63
CPU    9: hi:  186, btch:  31 usd: 158
CPU   10: hi:  186, btch:  31 usd: 162
CPU   11: hi:  186, btch:  31 usd: 165
CPU   12: hi:  186, btch:  31 usd:  48
CPU   13: hi:  186, btch:  31 usd:  64
CPU   14: hi:  186, btch:  31 usd:  66
CPU   15: hi:  186, btch:  31 usd: 165
active_anon:1356095 inactive_anon:839 isolated_anon:0
  active_file:93487 inactive_file:384620 isolated_file:0
  unevictable:0 dirty:65 writeback:0 unstable:0
  free:10288450 slab_reclaimable:40068 slab_unreclaimable:34613
  mapped:9236 shmem:474 pagetables:6530 bounce:0
Node 0 DMA free:15396kB min:36kB low:44kB high:52kB active_anon:0kB 
inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:14984kB mlocked:0kB dirty:0kB 
writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB 
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB 
pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 2991 24201 24201
Node 0 DMA32 free:2563944kB min:8092kB low:10112kB high:12136kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3063392kB 
mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB 
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 21210 21210
Node 0 Normal free:16895636kB min:57372kB low:71712kB high:86056kB 
active_anon:2986668kB inactive_anon:92kB active_file:278460kB 
inactive_file:1316128kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
present:21719040kB mlocked:0kB dirty:120kB writeback:0kB mapped:18948kB 
shmem:360kB slab_reclaimable:123620kB slab_unreclaimable:95556kB 
kernel_stack:7024kB pagetables:12872kB unstable:0kB bounce:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 1 Normal free:21678824kB min:65568kB low:81960kB high:98352kB 
active_anon:2437712kB inactive_anon:3264kB active_file:95488kB 
inactive_file:222352kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
present:24821760kB mlocked:0kB dirty:140kB writeback:0kB mapped:17996kB 
shmem:1536kB slab_reclaimable:36652kB slab_unreclaimable:42896kB 
kernel_stack:696kB pagetables:13248kB unstable:0kB bounce:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 1*4kB 2*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 
1*2048kB 3*4096kB = 15396kB
Node 0 DMA32: 6*4kB 12*8kB 3*16kB 6*32kB 8*64kB 16*128kB 6*256kB 7*512kB 
8*1024kB 6*2048kB 619*4096kB = 2563944kB
Node 0 Normal: 1832*4kB 1354*8kB 846*16kB 583*32kB 272*64kB 65*128kB 70*256kB 
53*512kB 37*1024kB 2*2048kB 4085*4096kB = 16895280kB
Node 1 Normal: 509*4kB 2146*8kB 1166*16kB 831*32kB 346*64kB 197*128kB 
200*256kB 202*512kB 192*1024kB 131*2048kB 5114*4096kB = 21678276kB
478555 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 8388600kB
Total swap = 8388600kB
12582896 pages RAM
227482 pages reserved
450152 pages shared
1712399 pages non-shared

And here's the initial OOM and Mem-Info message:

lvm invoked oom-killer: gfp_mask=0x201d0, order=0, oom_adj=0, oom_score_adj=0
lvm cpuset=/ mems_allowed=0
Pid: 3405, comm: lvm Not tainted 2.6.32-279.1.1.el6.x86_64 #1
Call Trace:
  [<ffffffff810c4981>] ? cpuset_print_task_mems_allowed+0x91/0xb0
  [<ffffffff811170f0>] ? dump_header+0x90/0x1b0
  [<ffffffff8121470c>] ? security_real_capable_noaudit+0x3c/0x70
  [<ffffffff81117572>] ? oom_kill_process+0x82/0x2a0
  [<ffffffff811174b1>] ? select_bad_process+0xe1/0x120
  [<ffffffff811179b0>] ? out_of_memory+0x220/0x3c0
  [<ffffffff811b3380>] ? blkdev_get_block+0x0/0x70
  [<ffffffff811276ce>] ? __alloc_pages_nodemask+0x89e/0x940
  [<ffffffff8115c1ea>] ? alloc_pages_current+0xaa/0x110
  [<ffffffff811144f7>] ? __page_cache_alloc+0x87/0x90
  [<ffffffff81113ede>] ? find_get_page+0x1e/0xa0
  [<ffffffff8111606b>] ? do_read_cache_page+0x4b/0x180
  [<ffffffff811b4330>] ? blkdev_readpage+0x0/0x20
  [<ffffffff811161e9>] ? read_cache_page_async+0x19/0x20
  [<ffffffff811161fe>] ? read_cache_page+0xe/0x20
  [<ffffffff811ecaa0>] ? read_dev_sector+0x30/0xa0
  [<ffffffff811edc5d>] ? amiga_partition+0x6d/0x460
  [<ffffffff811161e9>] ? read_cache_page_async+0x19/0x20
  [<ffffffff811ecaa0>] ? read_dev_sector+0x30/0xa0
  [<ffffffff811ef1ac>] ? osf_partition+0x6c/0x120
  [<ffffffff811ed7d7>] ? rescan_partitions+0x1a7/0x470
  [<ffffffff811b4ab6>] ? __blkdev_get+0x1b6/0x3c0
  [<ffffffff811b4ce0>] ? blkdev_open+0x0/0xc0
  [<ffffffff811b4cd0>] ? blkdev_get+0x10/0x20
  [<ffffffff811b4d51>] ? blkdev_open+0x71/0xc0
  [<ffffffff8117889a>] ? __dentry_open+0x10a/0x360
  [<ffffffff8121c272>] ? selinux_inode_permission+0x72/0xb0
  [<ffffffff812142af>] ? security_inode_permission+0x1f/0x30
  [<ffffffff81178c04>] ? nameidata_to_filp+0x54/0x70
  [<ffffffff8118c110>] ? do_filp_open+0x6c0/0xd60
  [<ffffffff81198192>] ? alloc_fd+0x92/0x160
  [<ffffffff81178649>] ? do_sys_open+0x69/0x140
  [<ffffffff81178760>] ? sys_open+0x20/0x30
  [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Mem-Info:
Node 0 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Node 0 DMA32 per-cpu:
CPU 0: hi: 42, btch: 7 usd: 23
active_anon:49 inactive_anon:97 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:3846 dirty:0 writeback:0 unstable:0
free:412 slab_reclaimable:1194 slab_unreclaimable:5681
mapped:356 shmem:0 pagetables:31 bounce:0
Node 0 DMA free:224kB min:0kB low:0kB high:0kB active_anon:0kB 
inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:328kB mlocked:0kB dirty:0kB 
writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB 
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB 
pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 125 125 125
Node 0 DMA32 free:1424kB min:1428kB low:1784kB high:2140kB active_anon:196kB 
inactive_anon:388kB active_file:0kB inactive_file:0kB unevictable:15384kB 
isolated(anon):0kB isolated(file):0kB present:128256kB mlocked:0kB dirty:0kB 
writeback:0kB mapped:1424kB shmem:0kB slab_reclaimable:4776kB 
slab_unreclaimable:22724kB kernel_stack:600kB pagetables:124kB unstable:0kB 
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 0*4kB 2*8kB 1*16kB 2*32kB 2*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 
0*2048kB 0*4096kB = 224kB
Node 0 DMA32: 0*4kB 2*8kB 2*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 
1*1024kB 0*2048kB 0*4096kB = 1424kB
3846 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
45035 pages RAM
16585 pages reserved
359 pages shared
23771 pages non-shared


Only 1 CPU is listed, the memory numbers are incredibly small (180MB of RAM?! - 
no wonder it is out of memory), there is no swap, no Normal zones are listed, etc.
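
To put numbers on that, here is a minimal sketch that converts the two "pages RAM" figures from the dumps above into sizes.  It assumes the usual 4kB x86_64 base page size, and the helper name is made up for illustration:

```python
# Convert the "pages RAM" counts from the two Mem-Info dumps into MiB.
# Assumption: 4 kB base pages (the normal x86_64 page size).
PAGE_KB = 4


def pages_to_mib(pages: int) -> float:
    """Return the size in MiB of `pages` base pages."""
    return pages * PAGE_KB / 1024


normal_pages = 12582896  # "pages RAM" during normal operation
oom_pages = 45035        # "pages RAM" in the OOM report

print(f"normal: {pages_to_mib(normal_pages):.0f} MiB")
print(f"oom:    {pages_to_mib(oom_pages):.0f} MiB")
```

The first figure works out to the expected 48GB; the second is 180140kB, i.e. roughly 176 MiB.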

It's fairly consistent: I see the same number of pages of RAM each time it 
crashes.  I ran memcheck through test #4 with no errors.

cmdline:

ro root=/dev/mapper/vg_root-root 
rd_MD_UUID=486b6486:65829f41:e3ccc1e2:ace1579a rd_LVM_LV=vg_root/root 
rd_LVM_LV=vg_root/swap rd_NO_LUKS rd_NO_DM LANG=en_US.UTF-8 
SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us 
crashkernel=512M-2G:64M,2G-:128M   console=tty0 console=ttyS0,115200
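
For reference, the crashkernel= value above uses the range syntax (start-[end]:size, comma separated), where the reservation is chosen by the amount of system RAM.  Here is a rough sketch of how that selection resolves, assuming 48GB of RAM as stated earlier and ignoring the optional @offset suffix; the function names are hypothetical, not kernel APIs:

```python
import re

# Size suffixes accepted in this sketch (binary multiples).
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text: str) -> int:
    """Parse a size such as '512M' or '2G' into bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * UNITS.get(match.group(2), 1)


def crashkernel_reservation(value: str, system_ram: int) -> int:
    """Return the reservation in bytes selected for system_ram, or 0 if none."""
    for entry in value.split(","):
        range_part, size_part = entry.split(":")
        start_text, end_text = range_part.split("-")
        start = parse_size(start_text)
        # An empty end (as in '2G-') means the range is open-ended.
        end = parse_size(end_text) if end_text else float("inf")
        if start <= system_ram < end:
            return parse_size(size_part)
    return 0


ram = 48 * (1 << 30)  # 48GB, as on this machine
reserved = crashkernel_reservation("512M-2G:64M,2G-:128M", ram)
print(reserved // (1 << 20), "MiB")
```

With 48GB of RAM this selects the second range, i.e. a 128M reservation.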


Any ideas?  I'm at a loss.

-- 
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA, Boulder Office                  FAX: 303-415-9702
3380 Mitchell Lane                       orion@...a.com
Boulder, CO 80301                   http://www.nwra.com
