linux-kernel - Re: Linux 2.6.29

Message-ID: <49C88C80.5010803@krogh.cc>
Date:	Tue, 24 Mar 2009 08:32:16 +0100
From:	Jesper Krogh <jesper@...gh.cc>
To:	David Rees <drees76@...il.com>
CC:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29

David Rees wrote:
> On Mon, Mar 23, 2009 at 11:19 PM, Jesper Krogh <jesper@...gh.cc> wrote:
>> I know this has been discussed before:
>>
>> [129401.996244] INFO: task updatedb.mlocat:31092 blocked for more than 480
>> seconds.
> 
> Ouch - 480 seconds, how much memory is in that machine, and how slow
> are the disks? 

The 480 seconds is not the "wait time" but the time that passes before the
message is printed. The kernel default was previously 120 seconds, but
that was changed by Ingo Molnar back in September. I do get a lot less
noise, but it really doesn't tell anything about the nature of the problem.
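
For reference, the threshold behind the "blocked for more than N seconds" message can be inspected at runtime. The following is a minimal sketch, assuming the kernel exposes it as /proc/sys/kernel/hung_task_timeout_secs; the helper name read_hung_task_timeout is made up for illustration and is not part of any tool mentioned in this thread.

```python
from pathlib import Path

# Assumed location of the hung-task warning threshold on Linux.
HUNG_TASK_SYSCTL = Path("/proc/sys/kernel/hung_task_timeout_secs")


def read_hung_task_timeout(path=HUNG_TASK_SYSCTL):
    """Return the hung-task warning threshold in seconds.

    Returns None when the file is missing, unreadable, or not an integer
    (for example on a kernel built without the hung-task detector).
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    timeout = read_hung_task_timeout()
    if timeout is None:
        print("hung-task timeout not available on this system")
    else:
        print(f"tasks blocked longer than {timeout}s trigger the INFO message")
```

The value only controls when the warning is printed; it says nothing about how long the task ends up being blocked in total.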

The system's specs:
32GB of memory. The disks are a Nexsan SataBeast with 42 SATA drives in
RAID10 connected using 4Gbit Fibre Channel. I'll leave it up to you to
decide if that's fast or slow.

The strange thing is actually that the above process (updatedb.mlocate)
is writing to /, which is a device without any activity at all. All
activity is on the Fibre Channel device above, but processes writing
outside that seem to be affected as well.

> What's your vm.dirty_background_ratio and
> vm.dirty_ratio set to?

2.6.29-rc8 defaults:
jk@...t:/proc/sys/vm$ cat dirty_background_ratio
5
jk@...t:/proc/sys/vm$ cat dirty_ratio
10
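
To put those percentages in perspective on a 32GB machine, here is a rough back-of-the-envelope sketch. It assumes the ratios apply to all of RAM; the kernel actually bases them on the memory it considers dirtyable, so the real limits are somewhat lower. The function name dirty_threshold_bytes is illustrative only.

```python
GIB = 1024 ** 3


def dirty_threshold_bytes(mem_bytes, ratio_percent):
    """Approximate dirty-page limit implied by a percentage ratio.

    Simplification: treats all of mem_bytes as dirtyable, which
    overestimates what the kernel would really allow.
    """
    return mem_bytes * ratio_percent // 100


if __name__ == "__main__":
    mem = 32 * GIB  # memory size taken from the system spec above
    for name, ratio in (("dirty_background_ratio", 5), ("dirty_ratio", 10)):
        limit = dirty_threshold_bytes(mem, ratio)
        print(f"{name}={ratio}%: about {limit / GIB:.1f} GiB of dirty pages")
```

Under that simplification, background writeout starts at roughly 1.6 GiB of dirty data and writers get throttled at roughly 3.2 GiB.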

>> Consensus seems to be something with large memory machines, lots of dirty
>> pages and a long writeout time due to ext3.
> 
> All filesystems seem to suffer from this issue to some degree.  I
> posted to the list earlier trying to see if there was anything that
> could be done to help my specific case.  I've got a system where if
> someone starts writing out a large file, it kills client NFS writes.
> Makes the system unusable:
> http://marc.info/?l=linux-kernel&m=123732127919368&w=2

Yes, I've hit 120s+ penalties just by saving a file in vim.

> Only workaround I've found is to reduce dirty_background_ratio and
> dirty_ratio to tiny levels.  Or throw good SSDs and/or a fast RAID
> array at it so that large writes complete faster.  Have you tried the
> new vm_dirty_bytes in 2.6.29?

No.. What would you suggest to be a reasonable setting for that?

> Everyone seems to agree that "autotuning" it is the way to go.  But no
> one seems willing to step up and try to do it.  Probably because it's
> hard to get right!

I can test patches.. but I'm not a kernel-developer.. unfortunately.

Jesper
