Message-ID: <Pine.LNX.4.64.0704281937110.14529@artax.karlin.mff.cuni.cz>
Date:	Sat, 28 Apr 2007 19:55:34 +0200 (CEST)
From:	Mikulas Patocka <mikulas@...ax.karlin.mff.cuni.cz>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Mike Galbraith <efault@....de>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jens Axboe <jens.axboe@...cle.com>
Subject: Re: [ext3][kernels >= 2.6.20.7 at least] KDE going comatose when FS
 is under heavy write load (massive starvation)



On Sat, 28 Apr 2007, Linus Torvalds wrote:

>> The main problem is that if the user extracts a tar archive, tar
>> eventually blocks on writeback I/O --- O.K. But if bash attempts to
>> write one page to the .bash_history file at the same time, it blocks
>> too --- bad, the user is annoyed.
>
> Right, but it's actually very unlikely. Think about it: the person who
> extracts the tar-archive is perhaps dirtying a thousand pages, while the
> .bash_history writeback is doing a single one. Which process do you think
> is going to hit the "oops, we went over the limit" case 99.9% of the time?

Both. See balance_dirty_pages --- you loop there while
global_page_state(NR_FILE_DIRTY) + global_page_state(NR_UNSTABLE_NFS) +
global_page_state(NR_WRITEBACK) is over the limit.
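
Roughly, that loop looks like this (a condensed paraphrase of
balance_dirty_pages() in mm/page-writeback.c from 2.6.20-era kernels,
from memory and with the bookkeeping dropped --- a sketch, not the exact
code):

	for (;;) {
		long nr_reclaimable;

		get_dirty_limits(&background_thresh, &dirty_thresh, mapping);
		nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
				 global_page_state(NR_UNSTABLE_NFS);

		/* Global test: it doesn't matter who dirtied the pages. */
		if (nr_reclaimable + global_page_state(NR_WRITEBACK) <=
							dirty_thresh)
			break;

		/* Over the limit: push writeback and wait for progress. */
		if (nr_reclaimable)
			writeback_inodes(&wbc);
		congestion_wait(WRITE, HZ/10);
	}

The test is on global counters, which is exactly why a one-page writer
can get caught behind tar's thousand dirty pages.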

So tar gets there first, starts writeback, and blocks. An innocent
process calling one small write() gets there too (while writeback has not
yet finished), sees that the expression is over the limit, and blocks as
well.

Really, you go to balance_dirty_pages with 1/8 probability per dirtied
page, so small writers will block with that probability --- better than
blocking always, but still annoying.

> The _really_ annoying problem is when you just have absolutely tons of
> memory dirty, and you start doing the writeback: if you saturate the IO
> queues totally, it simply doesn't matter _who_ starts the writeback,
> because anybody who needs to do any IO at all (not necessarily writing) is
> going to be blocked.

I saw this writeback problem on a machine that had a lot of memory (1G),
a fast internal disk where the distribution was installed, and a very
slow external SCSI disk (6 MB/s or so). When I did heavy writes to the
external disk and writeback started, the computer almost completely
locked up --- any process trying to write anything to the fast disk
blocked until the writeback on the slow disk finished.
(That machine runs some old RHEL kernel and is not mine, so I can't test
new kernels on it --- but the code fragment above shows that the problem
still exists today.)

Mikulas

