Date:	Tue, 24 Feb 2009 07:52:34 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Nick Piggin <nickpiggin@...oo.com.au>
cc:	Salman Qazi <sqazi@...gle.com>, davem@...emloft.net,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <andi@...stfloor.org>
Subject: Re: Performance regression in write() syscall



On Tue, 24 Feb 2009, Nick Piggin wrote:
>
> > it does make some kind of sense to try to avoid the noncached versions for
> > small writes - because small writes tend to be for temp-files.
> 
> I don't see the significance of a temp file. If the pagecache is truncated,
> then the cachelines remain dirty and so you can't avoid an eventual store
> back to RAM? 

No, because many small files end up being used as scratch-pads (think 
shell script sequences etc), and get read back again immediately. Doing 
non-temporal stores may be bad here simply because trying to play games 
with caching can easily do the wrong thing.


> > I don't know if PAGE_SIZE is the right thing to test, and I also don't
> > know if this is necessarily the best place to test it in, but I don't
> > think it's necessarily wrong to do something like this.
> 
> No, but I think it should be in arch code, and the "_nocache" suffix
> should just be a hint to the architecture that the destination is not
> so likely to be used.

Yes. Especially since arch code is likely to need various arch-specific 
checks anyway (like the x86 code does about aligning the destination).
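
To make the idea concrete, here is a minimal user-space sketch of treating 
"_nocache" as a hint rather than a command. It is not the kernel code: the 
names pick_copy_strategy() and copy_with_nocache_hint(), the 4096-byte 
threshold and the 8-byte alignment requirement are all assumptions made for 
illustration, and memcpy() stands in for both copy paths so the example 
stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical values for illustration only. The thread leaves the right
 * threshold open and mentions PAGE_SIZE merely as one candidate.
 */
#define SKETCH_PAGE_SIZE  4096u
#define SKETCH_DEST_ALIGN 8u

enum copy_strategy {
	COPY_CACHED,   /* ordinary stores, data stays in the CPU cache */
	COPY_NOCACHE   /* non-temporal stores that bypass the cache */
};

/* Honour the "nocache" hint only for large copies to an aligned destination. */
static enum copy_strategy pick_copy_strategy(const void *dst, size_t len)
{
	if (len < SKETCH_PAGE_SIZE)
		return COPY_CACHED;  /* small write, likely to be read back soon */
	if ((uintptr_t)dst % SKETCH_DEST_ALIGN != 0)
		return COPY_CACHED;  /* stand-in for an arch-specific alignment check */
	return COPY_NOCACHE;
}

/* Copy len bytes and report which strategy the hint resolved to. */
static enum copy_strategy copy_with_nocache_hint(void *dst, const void *src,
						 size_t len)
{
	enum copy_strategy strategy;

	assert(dst != NULL && src != NULL);
	strategy = pick_copy_strategy(dst, len);

	/*
	 * Both paths use memcpy() here as a portable stand-in. An actual
	 * architecture implementation would issue non-temporal stores for
	 * COPY_NOCACHE and ordinary stores for COPY_CACHED.
	 */
	memcpy(dst, src, len);
	return strategy;
}
```

Keeping the decision in one helper means generic callers only pass the hint 
along, and each architecture remains free to pick its own threshold and 
alignment rules.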

> It would have been nice to have had some numbers to justify
> 0812a579c92fefa57506821fa08e90f47cb6dbdd in the first place, so you have
> a point of reference to see what happens to your speed-up-case when you
> change things like this. Sigh.

Well, there were no performance numbers for that commit, since it didn't 
actually tie it into anything, but I'm pretty sure we saw several 
performance numbers for the change.

Yes, and they are in the commit logs. See "x86: cache pollution aware 
__copy_from_user_ll()", commit c22ce143d15eb288543fe9873e1c5ac1c01b69a1.

But notice that those are iozone numbers. Very much about _big_ writes.

		Linus
