linux-kernel - Re: 2.6.18 BUG: unable to handle kernel NULL pointer dereference at virtual address 0000000a

Message-Id: <20060923134244.e7b73826.akpm@osdl.org>
Date:	Sat, 23 Sep 2006 13:42:44 -0700
From:	Andrew Morton <akpm@...l.org>
To:	Christian Weiske <cweiske@...iske.de>
Cc:	linux-kernel@...r.kernel.org, reiserfs-dev@...esys.com,
	Ingo Molnar <mingo@...e.hu>,
	Nick Piggin <nickpiggin@...oo.com.au>
Subject: Re: 2.6.18 BUG: unable to handle kernel NULL pointer dereference at
 virtual address 0000000a


cc's added.  This looks quite serious.

On Sat, 23 Sep 2006 17:56:05 +0200
Christian Weiske <cweiske@...iske.de> wrote:

> Hello,
> 
> 
> I have a reproducible BUG on my server that occurs whenever disk usage
> gets too high or too much swapping occurs (at least I think that is the
> trigger). The box has one reiserfs filesystem of about 187 GB; the disk
> is attached to an Epia 5000 board through a Promise Ultra 100 PCI IDE
> controller card.
> 

Do you think this bug is due to the 2.6.18 upgrade?

Have you run fsck across the filesystem(s)?

Does the oops always look the same as this one?

Please turn on the various CONFIG_DEBUG_* options, see if that turns up
anything.

It would be interesting to find out if enabling CONFIG_4KSTACKS makes this
go away (although I'm not sure why).
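
For context on the stack-size question: on i386 the thread_info sits at the
base of the kernel stack and is located by masking the stack pointer with
THREAD_SIZE (8 KB by default, 4 KB with CONFIG_4KSTACKS). The sketch below,
in Python purely for illustration, applies that mask to the esp value from
the first oops quoted further down. The helper names are made up for this
example and are not kernel symbols.

```python
# Illustrative only: helper names are hypothetical, not kernel symbols.
THREAD_SIZE_8K = 8192   # i386 default
THREAD_SIZE_4K = 4096   # i386 with CONFIG_4KSTACKS=y


def thread_info_base(esp: int, thread_size: int) -> int:
    """Mask the stack pointer down to the start of its stack area."""
    return esp & ~(thread_size - 1)


def stack_bytes_used(esp: int, thread_size: int) -> int:
    """Bytes between the top of the stack area and the current esp."""
    return thread_info_base(esp, thread_size) + thread_size - esp


if __name__ == "__main__":
    esp = 0xC7F43850  # esp from the first oops
    print(hex(thread_info_base(esp, THREAD_SIZE_8K)))  # 0xc7f42000
    print(hex(thread_info_base(esp, THREAD_SIZE_4K)))  # 0xc7f43000
    print(stack_bytes_used(esp, THREAD_SIZE_8K))        # 1968
```

The 8 KB result matches the ti=c7f42000 printed in the oops, which suggests
(but does not prove) that this kernel was built without CONFIG_4KSTACKS.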

This looks more like a bug in the CPU scheduler than in the filesystem.

p->thread_info is NULL in scheduler_tick()'s first call to
set_tsk_need_resched(), at line 3008.
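
To spell out where the faulting address comes from, assuming I am reading
the Code: bytes correctly: `8b 43 04` loads p->thread_info into eax, and the
marked `0f ba 68 08 03` is a `bts $3, 8(%eax)`, which sets bit 3
(TIF_NEED_RESCHED on i386) in the flags word at offset 8. With the
near-NULL value eax = 00000002 from the register dump, that store lands on
address 0000000a. A small Python sketch of that decode follows; the helper
is hypothetical and handles only this one instruction form.

```python
# Hypothetical helper: decodes only the "0f ba /r disp8 imm8" form seen in the oops.
GROUP8_OPS = {4: "bt", 5: "bts", 6: "btr", 7: "btc"}   # x86 opcode 0F BA /4../7
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_bit_op(code: bytes):
    """Return (mnemonic, base register, displacement, bit number)."""
    if len(code) != 5 or code[0:2] != b"\x0f\xba":
        raise ValueError("not a 0f ba group-8 instruction with disp8")
    modrm = code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod == 1 means [reg + disp8]; rm == 4 would need a SIB byte.
    if mod != 1 or rm == 4 or reg not in GROUP8_OPS:
        raise ValueError("only [reg + disp8] operands are handled here")
    disp = code[3] - 256 if code[3] >= 128 else code[3]  # disp8 is signed
    return GROUP8_OPS[reg], REGS32[rm], disp, code[4]


def fault_address(base_value: int, disp: int) -> int:
    """Effective address touched by the instruction, wrapped to 32 bits."""
    return (base_value + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    op, base, disp, bit = decode_bit_op(bytes.fromhex("0fba680803"))
    print(op, base, disp, bit)                        # bts eax 8 3
    print(f"{fault_address(0x00000002, disp):08x}")   # 0000000a
```

The second oops, in try_to_wake_up(), shows the same instruction bytes and
the same eax, so the same arithmetic applies there.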

Thanks.

> 
> Any hints about how to resolve this problem are very welcome.
> 
> 
> The trace from the serial console:
> -------------
> Oops: 0002 [#1]
> PREEMPT
> Modules linked in:
> CPU:    0
> EIP:    0060:[<c0112a54>]    Not tainted VLI
> EFLAGS: 00010013   (2.6.18 #1)
> EIP is at scheduler_tick+0x84/0x340
> eax: 00000002   ebx: c7eec590   ecx: c5e960d5   edx: 4457222b
> esi: c5e96100   edi: 0000002b   ebp: c7f43864   esp: c7f43850
> ds: 007b   es: 007b   ss: 0068
> Process  (pid: 6820, ti=c7f42000 task=c7eec590 task.ti=00000002)
> Stack: 00000000 c7eec590 c7eec590 00000000 00000000 c7f438d0 c0120c83 c7f438d0
>        00000000 c010597b 00000000 c04fbe00 c013d785 00000000 00000000 c7f438d0
>        c056ea00 00000000 c04fbe00 c7f438d0 c013d833 00000000 c7f438d0 c04fbe00
> Call Trace:
>  [<c0120c83>] update_process_times+0x33/0x80
>  [<c010597b>] timer_interrupt+0x3b/0x70
>  [<c013d785>] handle_IRQ_event+0x35/0x70
>  [<c013d833>] __do_IRQ+0x73/0x100
>  [<c01047f5>] do_IRQ+0x25/0x50
>  [<c0102e7a>] common_interrupt+0x1a/0x20
>  [<c028300e>] _mmx_memcpy+0x6e/0x180
>  [<c01b69f6>] leaf_copy_items+0x36/0x100
>  [<c0282f1c>] memcpy+0x3c/0x50
>  [<c0282f88>] memmove+0x38/0x50
>  [<c01b72c5>] leaf_paste_in_buffer+0xa5/0x340
>  [<c019fc4c>] balance_leaf+0x2cc/0x2e10
>  [<c01af706>] get_parents+0x106/0x1a0
>  [<c01a2ac1>] do_balance+0x61/0xf0
>  [<c01b0d41>] wait_tb_buffers_until_unlocked+0x211/0x280
>  [<c01b0f46>] fix_nodes+0x196/0x3d0
>  [<c01bd3b6>] reiserfs_paste_into_item+0x196/0x1c0
>  [<c01ab701>] reiserfs_allocate_blocks_for_region+0x971/0x13c0
>  [<c01baea4>] search_for_position_by_key+0x134/0x330
>  [<c013f6a6>] add_to_page_cache+0x46/0xc0
>  [<c0162f92>] alloc_buffer_head+0x12/0x50
>  [<c0160385>] alloc_page_buffers+0x65/0xc0
>  [<c01a5606>] make_cpu_key+0x36/0x40
>  [<c01b9b16>] pathrelse+0x26/0x40
>  [<c01ad7a4>] reiserfs_file_write+0x694/0x720
>  [<c01404f6>] __generic_file_aio_read+0x196/0x210
>  [<c0140280>] file_read_actor+0x0/0xe0
>  [<c012039c>] change_clocksource+0xc/0x140
>  [<c0120b4d>] update_wall_time+0x18d/0x290
>  [<c012b0c0>] autoremove_wake_function+0x0/0x40
>  [<c0112c65>] scheduler_tick+0x295/0x340
>  [<c015e254>] vfs_write+0x84/0x150
>  [<c015e3cd>] sys_write+0x3d/0x70
>  [<c0102c17>] syscall_call+0x7/0xb
> Code: da 8b 5d f0 01 4b 50 11 53 54 39 1d 04 5d 5a c0 89 35 f8 5c 5a c0
> 89 3d fc
>  5c 5a c0 74 12 a1 0c 5d 5a c0 39 43 30 74 1f 8b 43 04 <0f> ba 68 08 03
> 8d 65 f4
>  5b 5e 5f 5d c3 eb 0d 90 90 90 90 90 90
> EIP: [<c0112a54>] scheduler_tick+0x84/0x340 SS:ESP 0068:c7f43850
>  <1>BUG: unable to handle kernel NULL pointer dereference at virtual address 0000000a
>  printing eip:
> c01123b2
> *pde = 00000000
> Oops: 0002 [#2]
> PREEMPT
> Modules linked in:
> CPU:    0
> EIP:    0060:[<c01123b2>]    Not tainted VLI
> EFLAGS: 00010097   (2.6.18 #1)
> EIP is at try_to_wake_up+0x52/0xb0
> eax: 00000002   ebx: cf79fa90   ecx: cf79fab8   edx: c7eec590
> esi: c05a5ce0   edi: 00000000   ebp: c7f436c8   esp: c7f436b8
> ds: 007b   es: 007b   ss: 0068
> Process  (pid: 6820, ti=c7f42000 task=c7eec590 task.ti=00000002)
> Stack: 00000012 00000000 c04fbfcc 00000001 c7f436ec c0112d66 cf79fa90 00000001
>        00000000 00000000 c7f42000 00000000 00000012 c7f43714 c0112dc2 c04fbfcc
>        00000001 00000001 00000000 00000000 000031f8 00000046 000031f8 fffff5d8
> Call Trace:
>  [<c0112d66>] __wake_up_common+0x36/0x70
>  [<c0112dc2>] __wake_up+0x22/0x50
>  [<c011786a>] release_console_sem+0xda/0x100
>  [<c01175af>] vprintk+0x18f/0x2b0
>  [<c01176b9>] vprintk+0x299/0x2b0
>  [<c010323d>] show_stack_log_lvl+0x8d/0xb0
>  [<c0112a68>] scheduler_tick+0x98/0x340
>  [<c011740f>] printk+0xf/0x20
>  [<c010ded3>] bust_spinlocks+0x43/0x50
>  [<c0103575>] die+0x85/0x210
>  [<c010e1c0>] do_page_fault+0x0/0x570
>  [<c010e490>] do_page_fault+0x2d0/0x570
>  [<c0112d66>] __wake_up_common+0x36/0x70
>  [<c010e1c0>] do_page_fault+0x0/0x570
>  [<c0102ec9>] error_code+0x39/0x40
>  [<c0112a54>] scheduler_tick+0x84/0x340
>  [<c0120c83>] update_process_times+0x33/0x80
>  [<c010597b>] timer_interrupt+0x3b/0x70
>  [<c013d785>] handle_IRQ_event+0x35/0x70
>  [<c013d833>] __do_IRQ+0x73/0x100
>  [<c01047f5>] do_IRQ+0x25/0x50
>  [<c0102e7a>] common_interrupt+0x1a/0x20
>  [<c028300e>] _mmx_memcpy+0x6e/0x180
>  [<c01b69f6>] leaf_copy_items+0x36/0x100
>  [<c0282f1c>] memcpy+0x3c/0x50
>  [<c0282f88>] memmove+0x38/0x50
>  [<c01b72c5>] leaf_paste_in_buffer+0xa5/0x340
>  [<c019fc4c>] balance_leaf+0x2cc/0x2e10
>  [<c01af706>] get_parents+0x106/0x1a0
>  [<c01a2ac1>] do_balance+0x61/0xf0
>  [<c01b0d41>] wait_tb_buffers_until_unlocked+0x211/0x280
>  [<c01b0f46>] fix_nodes+0x196/0x3d0
>  [<c01bd3b6>] reiserfs_paste_into_item+0x196/0x1c0
>  [<c01ab701>] reiserfs_allocate_blocks_for_region+0x971/0x13c0
>  [<c01baea4>] search_for_position_by_key+0x134/0x330
>  [<c013f6a6>] add_to_page_cache+0x46/0xc0
>  [<c0162f92>] alloc_buffer_head+0x12/0x50
>  [<c0160385>] alloc_page_buffers+0x65/0xc0
>  [<c01a5606>] make_cpu_key+0x36/0x40
>  [<c01b9b16>] pathrelse+0x26/0x40
>  [<c01ad7a4>] reiserfs_file_write+0x694/0x720
>  [<c01404f6>] __generic_file_aio_read+0x196/0x210
>  [<c0140280>] file_read_actor+0x0/0xe0
>  [<c012039c>] change_clocksource+0xc/0x140
>  [<c0120b4d>] update_wall_time+0x18d/0x290
>  [<c012b0c0>] autoremove_wake_function+0x0/0x40
>  [<c0112c65>] scheduler_tick+0x295/0x340
>  [<c015e254>] vfs_write+0x84/0x150
>  [<c015e3cd>] sys_write+0x3d/0x70
>  [<c0102c17>] syscall_call+0x7/0xb
> Code: 3d 83 f8 02 74 63 a8 40 75 62 6a 01 56 53 e8 f6 fe ff ff 8b 45 10
> 83 c4 0c
>  85 c0 75 1c 8b 56 20 8b 42 1c 39 43 1c 7d 11 8b 42 04 <0f> ba 68 08 03
> 89 f6 8d
>  bc 27 00 00 00 00 bf 01 00 00 00 c7 03
> EIP: [<c01123b2>] try_to_wake_up+0x52/0xb0 SS:ESP 0068:c7f436b8
>  <0>Kernel panic - not syncing: Fatal exception in interrupt
> -------------
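
When checking whether later oopses look the same as this one, it can help to
reduce each trace to its list of symbols and compare those. A rough Python
sketch, assuming the ` [<addr>] symbol+0xoff/0xsize` line format shown
above; the function name is made up for this example.

```python
import re

# Matches lines such as " [<c0120c83>] update_process_times+0x33/0x80",
# with or without a leading "> " quote marker.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(text: str):
    """Return a list of (address, symbol, offset, size) tuples, in order."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            addr, sym, off, size = m.groups()
            frames.append((int(addr, 16), sym, int(off, 16), int(size, 16)))
    return frames


if __name__ == "__main__":
    sample = """\
> EIP is at scheduler_tick+0x84/0x340
>  [<c0120c83>] update_process_times+0x33/0x80
>  [<c010597b>] timer_interrupt+0x3b/0x70
>  [<c028300e>] _mmx_memcpy+0x6e/0x180
"""
    for addr, sym, off, size in parse_frames(sample):
        print(f"{addr:08x} {sym} +{off:#x} of {size:#x}")
```

Comparing only the symbol column of two such lists is usually enough to tell
whether two reports went through the same code path.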
> 
> 
> # cat /proc/cpuinfo
> processor       : 0
> vendor_id       : CentaurHauls
> cpu family      : 6
> model           : 7
> model name      : VIA Samuel 2
> stepping        : 3
> cpu MHz         : 533.373
> cache size      : 64 KB
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 1
> wp              : yes
> flags           : fpu de tsc msr cx8 mtrr pge mmx 3dnow
> bogomips        : 1068.09
> 
> 
> # ./scripts/ver_linux
> If some fields are empty or look unusual you may have an old version.
> Compare to the current minimal requirements in Documentation/Changes.
> 
> Linux dojo 2.6.18 #1 PREEMPT Sat Sep 23 16:24:51 Local time zone must be
> set--see  i686 VIA Samuel 2 GNU/Linux
> 
> Gnu C                  3.4.6
> Gnu make               3.80
> binutils               2.16.1
> util-linux             2.12r
> mount                  2.12r
> module-init-tools      3.2.1
> e2fsprogs              1.38
> reiserfsprogs          3.6.19
> Linux C Library        2.3.6
> Dynamic linker (ldd)   2.3.6
> Procps                 3.2.6
> Net-tools              1.60
> Kbd                    1.12
> Sh-utils               5.94
> udev                   087
> Modules Loaded
> 
> 
> 
> 
> Please CC me as I am not subscribed.
> 
> -- 
> Regards/MfG,
> Christian Weiske
> 
> 