Date:	Thu, 2 Apr 2009 15:42:51 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Jeff Garzik <jeff@...zik.org>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	David Rees <drees76@...il.com>, Janne Grunau <j@...nau.net>,
	Lennart Sorensen <lsorense@...lub.uwaterloo.ca>,
	Theodore Tso <tytso@....edu>, Jesper Krogh <jesper@...gh.cc>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29



On Thu, 2 Apr 2009, Jeff Garzik wrote:
> 
> Dumb VM question, then:  I understand the logic behind the write-throttling
> part (some of my own userland code does something similar), but,
> 
> Does this imply adding fadvise to your overwrite.c example is (a) not
> noticeable, (b) potentially less efficient, (c) potentially more efficient?

For _that_ particular load it was more of an "it wasn't the issue". I 
wanted to get timely writeouts, because otherwise they bunch up and become 
unmanageable (and even people who are not actually writing end up 
waiting for the writeouts). 

Once the pages are clean, it just didn't matter. The VM did the balancing 
right enough that I stopped caring. With other access patterns (ie if the 
pages ended up on the active list) the situation might have been 
different.
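
For readers who have not seen the technique, here is a minimal sketch of
the "start writeout early, then wait for the previous chunk" pattern
discussed above. It is not the overwrite.c program itself: the helper
name stream_write() and its parameters are hypothetical, the chunk size
is left to the caller, and the sync_file_range() calls are treated as
best-effort hints, which a real program might not want.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original mail): write `nchunks`
 * zero-filled chunks of `chunk` bytes each to fd, starting at offset 0.
 * Returns the number of bytes written, or -1 on a failed or short write.
 */
static ssize_t stream_write(int fd, size_t chunk, size_t nchunks)
{
    assert(chunk > 0);

    char *buf = calloc(1, chunk);
    ssize_t total = 0;

    if (!buf)
        return -1;

    for (size_t i = 0; i < nchunks; i++) {
        off_t off = (off_t)i * (off_t)chunk;

        if (pwrite(fd, buf, chunk, off) != (ssize_t)chunk) {
            total = -1;
            break;
        }

        /* Start asynchronous writeout of the chunk we just wrote. */
        (void)sync_file_range(fd, off, (off_t)chunk, SYNC_FILE_RANGE_WRITE);

        /* Wait for the previous chunk, so dirty data never piles up. */
        if (i > 0)
            (void)sync_file_range(fd, off - (off_t)chunk, (off_t)chunk,
                                  SYNC_FILE_RANGE_WAIT_BEFORE |
                                  SYNC_FILE_RANGE_WRITE |
                                  SYNC_FILE_RANGE_WAIT_AFTER);

        total += (ssize_t)chunk;
    }

    free(buf);
    return total;
}
```

A caller would open the target with open(2) and call something like
stream_write(fd, 8 << 20, 128). Note that the last chunk is only queued,
not waited for, and sync_file_range() gives no durability guarantee, so
a program that cares about the data reaching disk still wants fsync().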

> Or IOW, does fadvise purely put pages on the cold list as your 
> sync_file_range incantation does, or something different?

sync_file_range() doesn't actually put the pages on the inactive list, but 
since the program was just a streaming one, they never even left it.

But no, fadvise actually tries to invalidate the pages (ie it gets rid 
of them, as opposed to moving them to the inactive list).
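
As an illustration of the fadvise side, a sketch assuming the call in
question is posix_fadvise() with POSIX_FADV_DONTNEED. The helper name
drop_cached_range() is made up for this example. Dirty pages are not
affected by POSIX_FADV_DONTNEED, so the sketch flushes first; that
ordering is an assumption about how one would want to use it, not
something the test program did.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to drop cached pages for
 * [off, off + len) of fd. len == 0 means "to the end of the file".
 * Returns 0 on success or an errno-style error number.
 */
static int drop_cached_range(int fd, off_t off, off_t len)
{
    assert(off >= 0 && len >= 0);

    /* Dirty pages cannot be invalidated, so write them out first. */
    if (fdatasync(fd) != 0)
        return errno;

    /* posix_fadvise() returns the error number directly, not via errno. */
    return posix_fadvise(fd, off, len, POSIX_FADV_DONTNEED);
}
```

In a streaming writer this would be called on ranges that have already
been written out, so the pages are invalidated instead of being left
for the VM to reclaim later.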

Another note: I literally used that program just for whole-disk testing, 
so the behavior on an actual filesystem may or may not match. But I just 
tested on ext3 on my desktop, and got

     1.734 GB written in 30.38 (58 MB/s)           

until I ^C'd it, and I didn't have any sound skipping or anything like 
that. Of course, that's with those nice Intel SSDs, so that doesn't 
really say anything.

Feel free to give it a try. It _should_ maintain good write speed while 
not disturbing the system much. But I bet if you added the "fadvise()" it 
would disturb things even _less_.

My only point is really that you _can_ do streaming writes well, but at 
the same time I do think the kernel makes it too hard to do it with 
"simple" applications. I'd love to get the same kind of high-speed 
streaming behavior by just doing a simple "dd if=/dev/zero of=bigfile"

And I really think we should be able to.

And no, we clearly are _not_ able to do that now. I just tried with "dd", 
and created a 1.7G file that way, and it was stuttering - even with my 
nice SSD setup. I'm in my MUA writing this email (obviously), and in the 
middle it just totally hung for about half a minute - because it was 
obviously doing some fsync() for temporary saving etc while the "sync" was 
going on.

With the "overwrite.c" thing, I do get short pauses when my MUA does 
something, but they are not of the "oops, everything hung for several 
seconds" kind. 
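
One rough way to put a number on such pauses is to time a small
save-and-fsync() while the big write is running in another terminal.
The sketch below is only an approximation of what a mail client might
do when saving a draft; the helper name timed_save() and the idea of
mimicking the MUA this way are assumptions, not something measured in
this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes to `path`, then fsync() it.
 * Returns the seconds spent inside fsync(), or -1.0 on any failure.
 */
static double timed_save(const char *path, const void *data, size_t len)
{
    assert(path != NULL);

    struct timespec t0, t1;
    double secs = -1.0;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1.0;

    if (write(fd, data, len) == (ssize_t)len &&
        clock_gettime(CLOCK_MONOTONIC, &t0) == 0 &&
        fsync(fd) == 0 &&
        clock_gettime(CLOCK_MONOTONIC, &t1) == 0) {
        secs = (double)(t1.tv_sec - t0.tv_sec) +
               (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    close(fd);
    return secs;
}
```

Calling this in a loop and printing the result during a plain "dd" run
and during a throttled streaming write would show whether the fsync()
latency differs the way the pauses described here suggest.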

(Full disclosure: 'alpine' with the local mbox on one disk - I _think_ 
that what alpine does is fsync() temporary save-files, but it might also 
be checking email in the background - I have not looked at _why_ alpine 
does an fsync, but it definitely does. And 5+ second delays are very 
annoying when writing emails - never mind half a minute).

			Linus