Date:	Wed, 13 May 2015 01:17:54 -0700
From:	Haren Myneni <hmyneni@...il.com>
To:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	linuxppc-dev@...ts.ozlabs.org
Cc:	Haren Myneni <hbabu@...ibm.com>, aneesh.kumar@...ux.vnet.ibm.com,
	srikar@...ux.vnet.ibm.com
Subject: mm: BUG_ON with NUMA_BALANCING (kernel BUG at include/linux/swapops.h:131!)

Hi,

I am hitting a BUG_ON in migration_entry_to_page() with the 4.1.0-rc2
kernel on a powerpc system with 512 CPUs (64 cores, 16 nodes) and
1.6 TB of memory. The issue is easy to recreate with a kernel compile
(make -j500), but I could not reproduce it with numa_balancing=disable.

------------[ cut here ]------------
kernel BUG at include/linux/swapops.h:134!
cpu 0x154: Vector: 700 (Program Check) at [c00009cf365c7610]
    pc: c00000000021e48c: remove_migration_pte+0x29c/0x450
    lr: c00000000021e47c: remove_migration_pte+0x28c/0x450
    sp: c00009cf365c7890
   msr: 8000000002029033
  current = 0xc00009cf36525fc0
  paca    = 0xc00000000e80fa00   softe: 0        irq_happened: 0x01
    pid   = 244969, comm = cc1
kernel BUG at include/linux/swapops.h:134!
enter ? for help
[c00009cf365c7960] c0000000001f3228 rmap_walk+0x348/0x460
[c00009cf365c7a10] c0000000008d8804 remove_migration_ptes+0x6c/0x84
[c00009cf365c7ab0] c000000000220d2c migrate_pages+0xaac/0xd20
[c00009cf365c7c00] c0000000002218cc migrate_misplaced_page+0x12c/0x210
[c00009cf365c7ca0] c0000000001e613c handle_mm_fault+0xa4c/0x17d0
[c00009cf365c7d70] c0000000008d1098 do_page_fault+0x3a8/0x800
[c00009cf365c7e30] c000000000008664 handle_page_fault+0x10/0x30

I think we are hitting a race in which the page referenced by the
migration entry is not locked when remove_migration_pte() looks it up.

dump_page() for *old page:

page:f00000035f36a5a0 count:1 mapcount:0 mapping:c00009cf3d351311
index:0x3ffffffe
flags: 0x93ffff800080009(locked|uptodate|swapbacked)

dump_page() for migrate entry page:

page:f00000009f36a5a0 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x13ffff800000000()
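
To make the comparison between the two dumps explicit, here is a minimal
sketch that decodes the low page-flag bits of both flag words. The bit
positions are an assumption (taken to follow the ordering of enum pageflags
in include/linux/page-flags.h for kernels of this era), and the helper names
decode_flags() and is_locked() are hypothetical, used only for illustration.

```python
# Low page-flag bit positions.
# ASSUMPTION: these follow the enum pageflags ordering in
# include/linux/page-flags.h; only the three flags that dump_page()
# printed above are listed here.
PG_LOCKED = 0
PG_UPTODATE = 3
PG_SWAPBACKED = 19

FLAG_BITS = {
    "locked": PG_LOCKED,
    "uptodate": PG_UPTODATE,
    "swapbacked": PG_SWAPBACKED,
}


def decode_flags(flags):
    """Return the names of the known flag bits that are set in `flags`."""
    return [name for name, bit in FLAG_BITS.items() if flags & (1 << bit)]


def is_locked(flags):
    """True if the (assumed) PG_locked bit is set in `flags`."""
    return bool(flags & (1 << PG_LOCKED))


# Flag words copied from the two dump_page() outputs above.
old_page_flags = 0x93ffff800080009
entry_page_flags = 0x13ffff800000000

print("old page:          ", decode_flags(old_page_flags))
print("migrate entry page:", decode_flags(entry_page_flags))
print("migrate entry page locked:", is_locked(entry_page_flags))
```

Under these assumptions the old page decodes to the same three names that
dump_page() printed, while the migrate entry page has none of them set, so
its lock bit is clear.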

Any suggestions on how to debug this issue?

Thanks
Haren
