Date:	Tue, 6 Apr 2010 13:02:35 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Borislav Petkov <bp@...en8.de>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Minchan Kim <minchan.kim@...il.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Lee Schermerhorn <Lee.Schermerhorn@...com>,
	Nick Piggin <npiggin@...e.de>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	sgunderson@...foot.com
Subject: Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linux
 2.6.34-rc3)



On Tue, 6 Apr 2010, Borislav Petkov wrote:
> 
> [ 2995.478125] PM: Preallocating image memory... 
> [ 2995.713692] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 2995.714001] IP: [<ffffffff810c194d>] page_referenced+0xee/0x1dc
> [ 2995.714001] PGD 22d1b8067 PUD 22dd85067 PMD 0 
> [ 2995.714001] Oops: 0000 [#1] PREEMPT SMP 
> [ 2995.714001] last sysfs file: /sys/power/state
> [ 2995.714001] CPU 0 
> [ 2995.714001] Modules linked in: tun powernow_k8 cpufreq_ondemand cpufreq_powersave cpufreq_userspace freq_table cpufreq_conservative binfmt_misc kvm_amd kvm ipv6 vfat fat dm_crypt dm_mod ohci_hcd pcspkr 8250_pnp 8250 k10temp edac_core serial_core
> [ 2995.714001] 
> [ 2995.714001] Pid: 7440, comm: hib.sh Not tainted 2.6.34-rc3-00288-gab195c5 #1 M3A78 PRO/System Product Name
> [ 2995.714001] RIP: 0010:[<ffffffff810c194d>]  [<ffffffff810c194d>] page_referenced+0xee/0x1dc
> [ 2995.714001] RSP: 0018:ffff88022fa038b8  EFLAGS: 00010283
> [ 2995.714001] RAX: ffff88022d747098 RBX: ffffea00078efb70 RCX: 0000000000000000
> [ 2995.714001] RDX: ffff88022fa03cf8 RSI: ffff88022d747070 RDI: ffff88022fb32520
> [ 2995.714001] RBP: ffff88022fa03938 R08: 0000000000000002 R09: 0000000000000000
> [ 2995.714001] R10: ffff88022fa038a8 R11: ffff88022d295d10 R12: 0000000000000000
> [ 2995.714001] R13: ffffffffffffffe0 R14: ffff88022d747058 R15: ffff88022fa03a00
> [ 2995.714001] FS:  00007f4da8b966f0(0000) GS:ffff88000a000000(0000) knlGS:0000000000000000
> [ 2995.714001] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 2995.714001] CR2: 0000000000000000 CR3: 000000022d11e000 CR4: 00000000000006f0
> [ 2995.714001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 2995.714001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 2995.714001] Process hib.sh (pid: 7440, threadinfo ffff88022fa02000, task ffff88022fb32520)
> [ 2995.714001] Stack:
> [ 2995.714001]  ffff88022d747098 00000000813fd2ac ffffffff8165ee28 0000000000000416
> [ 2995.714001] <0> ffff88022fa038f8 ffffffff810c6d40 ffffea00078fae60 ffffea00078fae60
> [ 2995.714001] <0> ffff88022fa03938 00000002810abd98 ffffea00078ec530 ffffea00078efb98
> [ 2995.714001] Call Trace:
> [ 2995.714001]  [<ffffffff810c6d40>] ? swapcache_free+0x37/0x3c
> [ 2995.714001]  [<ffffffff810ac31d>] shrink_page_list+0x171/0x4b1
> [ 2995.714001]  [<ffffffff813fd1e6>] ? _raw_spin_unlock_irq+0x30/0x58
> [ 2995.714001]  [<ffffffff810ac9b9>] shrink_inactive_list+0x35c/0x623
> [ 2995.714001]  [<ffffffff810acd94>] ? shrink_zone+0x114/0x3d4
> [ 2995.714001]  [<ffffffff81064f29>] ? print_lock_contention_bug+0x1b/0xe1
> [ 2995.714001]  [<ffffffff813fc790>] ? _raw_spin_lock_irq+0x19/0x79
> [ 2995.714001]  [<ffffffff810acf8a>] shrink_zone+0x30a/0x3d4
> [ 2995.714001]  [<ffffffff810ad19e>] ? shrink_slab+0x14a/0x15c
> [ 2995.714001]  [<ffffffff810adb65>] do_try_to_free_pages+0x176/0x27f
> [ 2995.714001]  [<ffffffff8103de67>] ? irq_exit+0x93/0x95
> [ 2995.714001]  [<ffffffff810add03>] shrink_all_memory+0x95/0xc4
> [ 2995.714001]  [<ffffffff810ab0f0>] ? isolate_pages_global+0x0/0x217
> [ 2995.714001]  [<ffffffff81077503>] ? count_data_pages+0x65/0x79
> [ 2995.714001]  [<ffffffff8107776a>] hibernate_preallocate_memory+0x1aa/0x2cb
> [ 2995.714001]  [<ffffffff813f95b5>] ? printk+0x41/0x44
> [ 2995.714001]  [<ffffffff810760b3>] hibernation_snapshot+0x36/0x1e1
> [ 2995.714001]  [<ffffffff8107632c>] hibernate+0xce/0x172
> [ 2995.714001]  [<ffffffff81075099>] state_store+0x5c/0xd3
> [ 2995.714001]  [<ffffffff8118728f>] kobj_attr_store+0x17/0x19
> [ 2995.714001]  [<ffffffff81127b69>] sysfs_write_file+0x108/0x144
> [ 2995.714001]  [<ffffffff810d66ff>] vfs_write+0xb2/0x153
> [ 2995.714001]  [<ffffffff810641a9>] ? trace_hardirqs_on_caller+0x1f/0x14b
> [ 2995.714001]  [<ffffffff810d6863>] sys_write+0x4a/0x71
> [ 2995.714001]  [<ffffffff810021db>] system_call_fastpath+0x16/0x1b
> [ 2995.714001] Code: 3b 56 10 73 1e 48 83 fa f2 74 18 48 8d 4d cc 4d 89 f8 48 89 df e8 4d f2 ff ff 41 01 c4 83 7d cc 00 74 19 4d 8b 6d 20 49 83 ed 20 <49> 8b 45 20 0f 18 08 49 8d 45 20 48 39 45 80 75 aa 4c 89 f7 e8 
> [ 2995.714001] RIP  [<ffffffff810c194d>] page_referenced+0xee/0x1dc
> [ 2995.714001]  RSP <ffff88022fa038b8>
> [ 2995.714001] CR2: 0000000000000000
> [ 2995.729717] ---[ end trace 92c25d74e4800968 ]---

So again, I can show that the code has never actually been through the 
loop. The above code decodes to:

   0:	3b 56 10             	cmp    0x10(%rsi),%edx
   3:	73 1e                	jae    0x23
   5:	48 83 fa f2          	cmp    $0xfffffffffffffff2,%rdx
   9:	74 18                	je     0x23
   b:	48 8d 4d cc          	lea    -0x34(%rbp),%rcx
   f:	4d 89 f8             	mov    %r15,%r8
  12:	48 89 df             	mov    %rbx,%rdi
  15:	e8 4d f2 ff ff       	callq  0xfffffffffffff267
  1a:	41 01 c4             	add    %eax,%r12d
  1d:	83 7d cc 00          	cmpl   $0x0,-0x34(%rbp)
  21:	74 19                	je     0x3c
  23:	4d 8b 6d 20          	mov    0x20(%r13),%r13
  27:	49 83 ed 20          	sub    $0x20,%r13
  2b:*	49 8b 45 20          	mov    0x20(%r13),%rax     <-- trapping instruction
  2f:	0f 18 08             	prefetcht0 (%rax)
  32:	49 8d 45 20          	lea    0x20(%r13),%rax
  36:	48 39 45 80          	cmp    %rax,-0x80(%rbp)
  3a:	75 aa                	jne    0xffffffffffffffe6
  3c:	4c 89 f7             	mov    %r14,%rdi
  3f:	e8                   	.byte 0xe8

and in your case, if we had gone through the loop, then %rax would still 
contain the return value from page_referenced_one(). 

But %rax is a kernel pointer, and %r12d is 0.

So again, it's actually anon_vma.head.next that is NULL, not any of the 
entries on the list itself.

Now, I can see several cases for this:

 - the obvious one: anon_vma just wasn't correctly initialized, and is 
   missing an INIT_LIST_HEAD(&anon_vma->head). That's either a slab bug (we 
   don't have a whole lot of coverage of constructors), or somebody 
   allocated an anon_vma without using the anon_vma_cachep.

 - Related to the above: perhaps the RCU freeing isn't working, or 
   slub/slab/slob ends up reusing the allocations for something other than 
   anon_vmas, so together with the race _and_ an unlucky re-use, you get 
   some odd crud.

   I haven't looked at the kernel config files: do they perhaps share the 
   same (odd?) SLUB/SLAB/SLOB config?

 - anon_vma isn't actually an anon_vma at all. 'page->mapping' was crud 
   with the low bit set. That sounds unlikely, but who knows. The ksm code 
   sets mapping to "stable_node + (PAGE_MAPPING_ANON | PAGE_MAPPING_KSM)"

   Did people have KSM enabled?

.. and probably other things I haven't even thought about.

		Linus