Message-ID: <46C62CF5.9080502@gmx.net>
Date: Sat, 18 Aug 2007 01:19:17 +0200
From: Dimitrios Apostolou <jimis@....net>
To: Rafał Bilski <rafalbilski@...eria.pl>
CC: linux-kernel@...r.kernel.org, Alan Cox <alan@...rguk.ukuu.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: high system cpu load during intense disk i/o
Hello list,
Before trying to reproduce the problem with older kernels, I took the
necessary step of compiling a simple, monolithic *vanilla* kernel and
using it for my measurements. The kernel config (attached config.gz) has
many standard things disabled (ACPI, for example), so the oprofile
output now looks very different. Please keep in mind that I switched
back from libata to the old IDE driver so that I can use the same
config on old kernels.
The situations I attach are:

idle: The PC doing nothing. Note that idle time is now spent in
  irq_handler and not in poll_idle. Strange...
one_disk: Destructive badblocks (badblocks -v -w) on one disk.
  Everything is responsive and the CPU is 99% iowait, as it should be.
two_disks: Destructive badblocks on two disks, before the problem
  appears. Things are starting to get sluggish.
two_disks_bad2: *PROBLEM* The previous situation after several
  minutes, and after several cron jobs kicked in (and never finished).
  System in a bad state, highly unresponsive.
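
For anyone who wants to cross-check the iowait and system shares in these scenarios without vmstat, here is a minimal Python sketch that compares two snapshots of the aggregate "cpu" line in /proc/stat. The function names are hypothetical and not part of any tool used above; it assumes the usual field order (user, nice, system, idle, iowait, irq, softirq) and ignores any later fields.

```python
# Field order of the aggregate "cpu" line in /proc/stat (assumed: 2.6-style layout).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Return {field: jiffies} for an aggregate 'cpu ...' line from /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(p) for p in parts[1:1 + len(FIELDS)]]
    if len(values) != len(FIELDS):
        raise ValueError("cpu line has fewer fields than expected")
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Percentage of elapsed CPU time per field between two snapshot lines."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: b[name] - a[name] for name in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * d / total for name, d in delta.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate cpu line."""
    with open(path) as f:
        return f.readline()
```

Calling read_cpu_line() twice a few seconds apart and passing both results to cpu_shares() gives the split for that interval. A high "system" share next to a high "iowait" share would correspond to the bad state described above.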
The profile now looks completely different (although in practice the
problem is exactly the same), probably because of the changed kernel
options. Here are the first lines from opreport with debugging info, for
the two_disks_bad2 scenario:
CPU: PIII, speed 798.02 MHz (estimated)
Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a
unit mask of 0x00 (No unit mask) count 100000
samples  %        linenr info              symbol name
282      9.3377   ide-iops.c:1081          pre_reset
231      7.6490   stats.c:187              rpc_print_iostats
222      7.3510   ptrace.c:654             do_syscall_trace
146      4.8344   ide-io.c:1185            ide_do_request
144      4.7682   process.c:529            dump_task_regs
131      4.3377   stats.c:64               rpc_proc_open
122      4.0397   backing-dev.c:46         congestion_wait
98       3.2450   vsprintf.c:622           vsscanf
52       1.7219   process.c:643            __switch_to
33       1.0927   sched.c:4065             interruptible_sleep_on
32       1.0596   slub.c:597               check_object
32       1.0596   signal.c:244             setup_sigcontext
32       1.0596   signal.c:56              sys_sigaction
31       1.0265   buffer.c:2452            block_truncate_page
31       1.0265   fadvise.c:28             sys_fadvise64_64
31       1.0265   page-writeback.c:987     test_set_page_writeback
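
To compare the four attached profiles more easily, a listing like the one above can be summed per source file. This is only a rough Python sketch: the function names are hypothetical, and it assumes each data row has exactly four whitespace-separated columns (samples, percent, file:line, symbol), as in the excerpt. Rows in any other shape are skipped.

```python
from collections import defaultdict


def parse_opreport(text):
    """Yield (samples, percent, source_file, line_number, symbol) per data row."""
    for row in text.splitlines():
        cols = row.split()
        # Data rows: sample count, percentage, file:line, symbol name.
        if len(cols) != 4 or not cols[0].isdigit():
            continue
        source_file, sep, line_number = cols[2].rpartition(":")
        if not sep or not line_number.isdigit():
            continue
        try:
            percent = float(cols[1])
        except ValueError:
            continue
        yield int(cols[0]), percent, source_file, int(line_number), cols[3]


def samples_by_file(text):
    """Return [(source_file, total_samples), ...] sorted by samples, descending."""
    totals = defaultdict(int)
    for samples, _percent, source_file, _line, _symbol in parse_opreport(text):
        totals[source_file] += samples
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Passing the contents of one of the attached oprof_*.txt files to samples_by_file() gives a per-file ranking. Note that the listing shows only base file names, so unrelated files that share a name would be merged by this grouping.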
If you think I should enable or disable other kernel options, please
tell me. It would also be nice to know how to use the various debugging
options that I enabled, to help figure out the problem. So what do you
think? Does this help, or should I start trying older kernels (which is
*hard* to do with the latest libc and udev that I have)?
Thanks again,
Dimitris
Download attachment "config.gz" of type "application/x-gzip" (5793 bytes)
View attachment "dmesg.txt" of type "text/plain" (9092 bytes)
View attachment "oprof_idle.txt" of type "text/plain" (9502 bytes)
View attachment "oprof_one_disk.txt" of type "text/plain" (17721 bytes)
View attachment "oprof_two_disks.txt" of type "text/plain" (14877 bytes)
View attachment "oprof_two_disks_bad2.txt" of type "text/plain" (17366 bytes)
View attachment "vmstat_idle.txt" of type "text/plain" (944 bytes)
View attachment "vmstat_one_disk.txt" of type "text/plain" (936 bytes)
View attachment "vmstat_two_disks.txt" of type "text/plain" (936 bytes)
View attachment "vmstat_two_disks_bad2.txt" of type "text/plain" (937 bytes)