linux-kernel - Re: high system cpu load during intense disk i/o


Message-ID: <46C62CF5.9080502@gmx.net>
Date:	Sat, 18 Aug 2007 01:19:17 +0200
From:	Dimitrios Apostolou <jimis@....net>
To:	Rafał Bilski <rafalbilski@...eria.pl>
CC:	linux-kernel@...r.kernel.org, Alan Cox <alan@...rguk.ukuu.org.uk>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: high system cpu load during intense disk i/o

Hello list,

Before trying to reproduce the problem with older kernels, I took the 
necessary step of building and booting a *vanilla*, simple monolithic 
kernel for my measurements. The kernel config (attached config.gz) has 
many standard things disabled (ACPI, for example), so the oprofile 
output now looks very different. Please keep in mind that I switched 
back from libata to the old IDE driver, so that I can use the same 
config on old kernels.

The situations I attach are:

   idle:            The PC doing nothing. Note that idle time is now
                    spent in irq_handler and not in poll_idle.
                    Strange...
   one_disk:        Destructive badblocks (badblocks -v -w) on one
                    disk. Everything is responsive and the CPU is 99%
                    iowait, as it should be.
   two_disks:       Destructive badblocks on two disks, before the
                    problem appears. Things are starting to get
                    sluggish.
   two_disks_bad2:  *PROBLEM* The previous situation after several
                    minutes, and after several cron jobs kicked in
                    (and never finished). System in a bad state,
                    highly unresponsive.
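
For readers who want to check the "99% iowait" figure themselves, here 
is a minimal sketch (not part of the original measurements) that 
computes the iowait share from two samples of the aggregate "cpu" line 
of /proc/stat. The function names and the sample counter values are 
made up for illustration; only the field order (user, nice, system, 
idle, iowait, irq, softirq, steal) comes from the kernel's documented 
/proc/stat layout.

```python
def parse_cpu_line(line):
    """Return the integer counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' samples."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    # Use only the first eight counters (user .. steal); later guest
    # counters, where present, are already included in user/nice.
    deltas = [y - x for x, y in zip(a[:8], b[:8])]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


# Hypothetical samples taken a couple of seconds apart.
BEFORE = "cpu 100 0 50 1000 200 5 5 0"
AFTER = "cpu 101 0 51 1000 398 5 5 0"
print(iowait_percent(BEFORE, AFTER))
```

On a live system the two strings would be the first line of /proc/stat 
read at two points in time; vmstat reports the same quantity in its 
"wa" column.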

The profile now looks completely different (although in practice the 
problem is exactly the same), probably because of the changed kernel 
options. Here are the first lines from opreport with debugging info, 
for the two_disks_bad2 scenario:

CPU: PIII, speed 798.02 MHz (estimated)
Counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        linenr info                 symbol name
282       9.3377  ide-iops.c:1081             pre_reset
231       7.6490  stats.c:187                 rpc_print_iostats
222       7.3510  ptrace.c:654                do_syscall_trace
146       4.8344  ide-io.c:1185               ide_do_request
144       4.7682  process.c:529               dump_task_regs
131       4.3377  stats.c:64                  rpc_proc_open
122       4.0397  backing-dev.c:46            congestion_wait
98        3.2450  vsprintf.c:622              vsscanf
52        1.7219  process.c:643               __switch_to
33        1.0927  sched.c:4065                interruptible_sleep_on
32        1.0596  slub.c:597                  check_object
32        1.0596  signal.c:244                setup_sigcontext
32        1.0596  signal.c:56                 sys_sigaction
31        1.0265  buffer.c:2452               block_truncate_page
31        1.0265  fadvise.c:28                sys_fadvise64_64
31        1.0265  page-writeback.c:987        test_set_page_writeback
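
To compare the attached profiles more easily, the opreport rows can be 
grouped by source file. The following is only a sketch under the 
assumption that each data row has exactly four whitespace-separated 
columns (samples, percent, file:line, symbol), as in the listing above; 
the function names are hypothetical, and the sample text is a subset of 
that listing. Note that identical base names (stats.c, process.c) may 
belong to different directories, so the grouping is approximate.

```python
from collections import defaultdict

# A few rows copied from the listing above.
SAMPLE = """\
samples  %        linenr info                 symbol name
282       9.3377  ide-iops.c:1081             pre_reset
231       7.6490  stats.c:187                 rpc_print_iostats
146       4.8344  ide-io.c:1185               ide_do_request
131       4.3377  stats.c:64                  rpc_proc_open
"""


def parse_opreport(text):
    """Parse data rows into (samples, percent, source, lineno, symbol) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip headers and any row that does not have the four expected columns.
        if len(parts) != 4 or not parts[0].isdigit():
            continue
        samples, percent, linenr, symbol = parts
        source, sep, lineno = linenr.rpartition(":")
        if not sep or not lineno.isdigit():
            continue
        rows.append((int(samples), float(percent), source, int(lineno), symbol))
    return rows


def samples_by_file(rows):
    """Sum sample counts per source file, largest first."""
    totals = defaultdict(int)
    for samples, _percent, source, _lineno, _symbol in rows:
        totals[source] += samples
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for source, total in samples_by_file(parse_opreport(SAMPLE)):
    print(f"{total:6d}  {source}")
```

Running the same grouping over each of the attached oprof_*.txt files 
would show which files gain samples between the two_disks and 
two_disks_bad2 scenarios.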

If you think I should enable or disable other kernel options, please 
tell me. It would also be nice to know how to use the various 
debugging options that I enabled, to help figure out the problem. So 
what do you think? Does this help, or should I start trying older kernels 
(which is *hard* to do with the latest libc and udev that I have)?

Thanks again,
Dimitris

Download attachment "config.gz" of type "application/x-gzip" (5793 bytes)

View attachment "dmesg.txt" of type "text/plain" (9092 bytes)

View attachment "oprof_idle.txt" of type "text/plain" (9502 bytes)

View attachment "oprof_one_disk.txt" of type "text/plain" (17721 bytes)

View attachment "oprof_two_disks.txt" of type "text/plain" (14877 bytes)

View attachment "oprof_two_disks_bad2.txt" of type "text/plain" (17366 bytes)

View attachment "vmstat_idle.txt" of type "text/plain" (944 bytes)

View attachment "vmstat_one_disk.txt" of type "text/plain" (936 bytes)

View attachment "vmstat_two_disks.txt" of type "text/plain" (936 bytes)

View attachment "vmstat_two_disks_bad2.txt" of type "text/plain" (937 bytes)