Message-ID: <51B9DB23.7010609@nod.at>
Date: Thu, 13 Jun 2013 16:45:55 +0200
From: Richard Weinberger <richard@....at>
To: Michal Hocko <mhocko@...e.cz>
CC: LKML <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
cgroups mailinglist <cgroups@...r.kernel.org>,
"kamezawa.hiroyu@...fujitsu.com" <kamezawa.hiroyu@...fujitsu.com>,
bsingharora@...il.com, hannes@...xchg.org
Subject: Re: mem_cgroup_page_lruvec: BUG: unable to handle kernel NULL pointer
dereference at 00000000000001a8
On 13.06.2013 16:39, Michal Hocko wrote:
> On Thu 13-06-13 15:34:59, Richard Weinberger wrote:
>> On 13.06.2013 15:32, Michal Hocko wrote:
>>> Ohh, and could you post the config, please? Sorry, I should have asked
>>> earlier.
>>
>> See attachment.
>
> Nothing unusual there. Could you enable CONFIG_DEBUG_VM? Maybe it will
> help to catch the problem earlier.
OK
>>> On Thu 13-06-13 15:29:08, Michal Hocko wrote:
>>>>
>>>> On Thu 13-06-13 14:06:20, Richard Weinberger wrote:
>>>> [...]
>>>>> All code
>>>>> ========
>>>>> 0: 89 50 08 mov %edx,0x8(%rax)
>>>>> 3: 48 89 d1 mov %rdx,%rcx
>>>>> 6: 0f 1f 40 00 nopl 0x0(%rax)
>>>>> a: 49 8b 04 24 mov (%r12),%rax
>>>>> e: 48 89 c2 mov %rax,%rdx
>>>>> 11: 48 c1 e8 38 shr $0x38,%rax
>>>>> 15: 83 e0 03 and $0x3,%eax
>>>> nid = page_to_nid
>>>>> 18: 48 c1 ea 3a shr $0x3a,%rdx
>>>> zid = page_zonenum
>
> Ohh, I am wrong here. rdx should be nid and eax the zid.
>
>>>>
>>>>> 1c: 48 69 c0 38 01 00 00 imul $0x138,%rax,%rax
>>>>> 23: 48 03 84 d1 e0 02 00 add 0x2e0(%rcx,%rdx,8),%rax
>>>> &memcg->nodeinfo[nid]->zoneinfo[zid]
>>>>
>>>>> 2a: 00
>>>>> 2b:* 48 3b 58 70 cmp 0x70(%rax),%rbx <-- trapping instruction
>>>>
>>>> OK, so this maps to:
>>>> if (unlikely(lruvec->zone != zone)) <<<
>>>> lruvec->zone = zone;
>>>>
>>>>> [35355.883056] RSP: 0000:ffff88003d523aa8 EFLAGS: 00010002
>>>>> [35355.883056] RAX: 0000000000000138 RBX: ffff88003fffa600 RCX: ffff88003e04a800
>>>>> [35355.883056] RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000028500
>>>>> [35355.883056] RBP: ffff88003d523ab8 R08: 0000000000000000 R09: 0000000000000000
>>>>> [35355.883056] R10: 0000000000000000 R11: dead000000100100 R12: ffffea0000a14000
>>>>> [35355.883056] R13: ffff88003e04b138 R14: ffff88003d523bb8 R15: ffffea0000a14020
>>>>> [35355.883056] FS: 0000000000000000(0000) GS:ffff88003fd80000(0000)
>>>>
>>>> RAX (lruvec) is obviously incorrect and doesn't make any sense. It should
>>>> contain an address at an offset from ffff88003e04a800, but there is 0x138
>>>> there instead.
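
[Editor's sketch] The address arithmetic discussed above can be replayed with the constants visible in the disassembly and the register dump. This is only an illustration: the macro and function names are made up for the sketch and are not kernel identifiers, and it assumes (the thread does not confirm this) that the nodeinfo[nid] slot read back as NULL and that zid was 1. Under those assumptions RAX comes out as 0x138, and adding the 0x70 displacement of the trapping cmp gives 0x1a8, the address in the subject line.

```c
#include <assert.h>
#include <stdint.h>

/* Constants read off the quoted disassembly. The names are hypothetical. */
#define NODEINFO_OFFSET    0x2e0ULL /* displacement in add 0x2e0(%rcx,%rdx,8),%rax */
#define ZONEINFO_STRIDE    0x138ULL /* multiplier in imul $0x138,%rax,%rax */
#define LRUVEC_ZONE_OFFSET 0x70ULL  /* displacement in cmp 0x70(%rax),%rbx */

/* Address of the nodeinfo[nid] slot that the add instruction reads. */
uint64_t nodeinfo_slot(uint64_t memcg, uint64_t nid)
{
    return memcg + NODEINFO_OFFSET + nid * 8;
}

/* Value left in RAX: the pointer loaded from the slot plus zid * stride. */
uint64_t lruvec_addr(uint64_t nodeinfo_ptr, uint64_t zid)
{
    return nodeinfo_ptr + zid * ZONEINFO_STRIDE;
}

/* Replays the computation with the values from the register dump. */
void check_against_oops(void)
{
    uint64_t rcx = 0xffff88003e04a800ULL; /* RCX (memcg) from the dump */
    uint64_t rdx = 0x20;                  /* RDX (nid) from the dump */

    /* Slot the CPU would have read for node 32. */
    assert(nodeinfo_slot(rcx, rdx) == 0xffff88003e04abe0ULL);

    /* Assumption: the slot held NULL and zid was 1. */
    uint64_t rax = lruvec_addr(0, 1);
    assert(rax == 0x138);                      /* matches RAX in the dump */
    assert(rax + LRUVEC_ZONE_OFFSET == 0x1a8); /* matches the faulting address */
}
```

Calling check_against_oops() from a small main() passes silently, which only shows that a NULL slot with zid 1 is consistent with the dump. It does not show why the slot would be empty.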
>
> Hmm, now that I am looking at the registers again, RDX, which should be
> nid, seems to be quite big. It says this is node 32. Does the machine
> really have so many NUMA nodes?
No. It's a KVM guest with two CPUs. Nothing special.
qemu command line:
qemu-kvm -m 1G -drive file=lxc_host.qcow2,if=virtio -nographic -kernel linux/arch/x86/boot/bzImage -append console=ttyS0 root=/dev/vda2 -net user,hostfwd=tcp::5555-:22 -net nic,model=e1000 -smp 4
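
[Editor's sketch] On the node question: the quoted disassembly extracts the zone with shr $0x38 followed by and $0x3, and the node with shr $0x3a. The sketch below mirrors those two operations on a 64-bit flags word. The function names and the sample flags value are hypothetical; the real page->flags value is not in the oops. It only illustrates that a node id of 32 requires bit 63 of the word to be set with the other five node bits clear, and says nothing about how the word got into that state.

```c
#include <assert.h>
#include <stdint.h>

/* Shift and mask values copied from the quoted disassembly.
 * The names are hypothetical, not the kernel's macros. */
#define ZONE_SHIFT 0x38
#define ZONE_MASK  0x3ULL
#define NODE_SHIFT 0x3a

/* Mirrors: shr $0x38,%rax ; and $0x3,%eax */
unsigned zone_from_flags(uint64_t flags)
{
    return (unsigned)((flags >> ZONE_SHIFT) & ZONE_MASK);
}

/* Mirrors: shr $0x3a,%rdx (a logical shift, so the top 6 bits remain) */
unsigned node_from_flags(uint64_t flags)
{
    return (unsigned)(flags >> NODE_SHIFT);
}

/* Hypothetical flags word that would decode to nid 32 and zid 1. */
void check_node_decoding(void)
{
    uint64_t flags = (32ULL << NODE_SHIFT) | (1ULL << ZONE_SHIFT);

    assert(flags == 0x8100000000000000ULL); /* bit 63 and bit 56 set */
    assert(node_from_flags(flags) == 32);   /* RDX = 0x20 in the dump */
    assert(zone_from_flags(flags) == 1);
    assert(node_from_flags(0) == 0);        /* all node bits clear decodes to node 0 */
}
```

With a 6-bit field at this position, any word with bit 63 set and bits 58 to 62 clear decodes to node 32, whatever the lower bits contain.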
Thanks,
//richard