Date:	Wed, 4 Jun 2008 18:36:01 +0200
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Tristan Linnenbank <tristan@...e.nl>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: file_splice_read problem in 2.6.24.2?

On Wed, Jun 04 2008, Tristan Linnenbank wrote:
> Dear lkml,
> 
> this afternoon I had a kernel crash on one of my webboxes. 
> Halting/rebooting the machine after the crash was not possible. I 
> had to power cycle it.
> 
> Pid: 22361, comm: apache2 Not tainted (2.6.24.2-fwsh-byte #2)
> EIP: 0060:[<c0140967>] EFLAGS: 00000286 CPU: 0
> EIP is at find_get_pages_contig+0x67/0x73
> EAX: 00000000 EBX: 00000010 ECX: c1c75e20 EDX: c1c75e20
> ESI: 00000010 EDI: de5cb920 EBP: 00000010 ESP: d43b7cd8
>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> CR0: 8005003b CR2: b77f8e04 CR3: 0c78a000 CR4: 000006f0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400
>  [<c017c49c>] __generic_file_splice_read+0xa2/0x41e
>  [<c0132efc>] clocksource_get_next+0x3a/0x40
>  [<c0113b11>] sched_slice+0x15/0x6f
>  [<c0110cb8>] read_hpet+0xa/0xd
>  [<c0131291>] getnstimeofday+0x31/0x105
>  [<f905ee38>] kcs_event+0xb0/0x690 [ipmi_si]
>  [<c0134301>] clockevents_program_event+0xbf/0x134
>  [<f905c07d>] start_next_msg+0x14/0xa1 [ipmi_si]
>  [<c0122ed9>] lock_timer_base+0x27/0x51
>  [<c0122f83>] __mod_timer+0x80/0x8e
>  [<f905c9ba>] smi_timeout+0x0/0xfe [ipmi_si]
>  [<c0123289>] run_timer_softirq+0xcf/0x184
>  [<c012a893>] __rcu_process_callbacks+0x76/0xbb
>  [<c011f979>] tasklet_action+0x53/0x93
>  [<c011f754>] __do_softirq+0xba/0xcf
>  [<c017c88d>] generic_file_splice_read+0x75/0xc9
>  [<c01eda5c>] nfs_file_splice_read+0x67/0x9d
>  [<c017d083>] do_splice_to+0x6e/0x90
>  [<c017d144>] splice_direct_to_actor+0x9f/0x166
>  [<c017d20b>] direct_splice_actor+0x0/0x31
>  [<c017d2a4>] do_splice_direct+0x68/0x8b
>  [<c016141a>] do_readv_writev+0x130/0x193
>  [<c01617ff>] do_sendfile+0x1f5/0x256
>  [<c01618b8>] sys_sendfile+0x58/0xa5
>  [<c0102836>] sysenter_past_esp+0x5f/0x85
>  =======================
> 
> Pid 22361 was an apache2 process.
> The "-fwsh-byte" suffix in the kernel version string indicates a 
> forwarded-share patch applied to the kernel.
> 
> We (the company I work for) have had similar kernel crashes before 
> (see http://article.gmane.org/gmane.linux.nfs/19130 and 
> http://article.gmane.org/gmane.linux.nfs/19107). Those crashes were 
> on NFS servers, whereas this webbox is an NFS client.
> 
> We switched the webbox to kernel 2.6.25.4 to test whether that fixes 
> the problem.
> 
> Has anyone else experienced this issue?
> 
> What information can I provide to ease debugging?
> 
> As I am not a member of LKML, could you please CC me in the replies 
> to the list?

So either this is fixed by this:

http://git.kernel.dk/?p=linux-2.6.git;a=commit;h=8191ecd1d14c6914c660dfa007154860a7908857

or it's a different bug. You should post the full oops (including any
message that came before the oops, like the 'locked up for foo seconds'
in the urls you reference above) with the Code line at the bottom as
well so we can see what the registers are used for.

If it's the bug fixed with the above commit, then 2.6.25.x should
work. Unfortunately I'm unsure of the -stable status of the above
patch.
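
For anyone reading the trace quoted above: each frame has the form
`[<address>] symbol+0xoffset/0xsize`, optionally followed by `[module]`
for code that lives in a loadable module. Below is a minimal Python sketch,
assuming exactly that line format, that splits one frame into its fields.
The names `FRAME_RE` and `parse_frame` are hypothetical helpers made up for
illustration; they are not part of any kernel tooling.

```python
import re

# One frame of the call trace quoted above, e.g.
#   [<c017c49c>] __generic_file_splice_read+0xa2/0x41e
#   [<f905ee38>] kcs_event+0xb0/0x690 [ipmi_si]
# Assumption: address, offset and size are lowercase hex, as in this oops.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frame(line):
    """Return a dict describing one trace frame, or None if the line has none."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),      # address found on the stack
        "symbol": m.group("symbol"),           # function containing that address
        "offset": int(m.group("offset"), 16),  # bytes into the function
        "size": int(m.group("size"), 16),      # total size of the function
        "module": m.group("module"),           # None when no [module] tag is present
    }
```

For example, `parse_frame(">  [<f905ee38>] kcs_event+0xb0/0x690 [ipmi_si]")`
yields the symbol `kcs_event` with module `ipmi_si`, while register lines
such as the `EIP:` line return `None` because they do not match the frame
pattern.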

-- 
Jens Axboe
