linux-kernel - Re: Performance regression in write() syscall

Message-Id: <200902242002.37555.nickpiggin@yahoo.com.au>
Date:	Tue, 24 Feb 2009 20:02:36 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Salman Qazi <sqazi@...gle.com>, davem@...emloft.net,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <andi@...stfloor.org>
Subject: Re: Performance regression in write() syscall

On Tuesday 24 February 2009 15:28:54 Linus Torvalds wrote:
> On Tue, 24 Feb 2009, Nick Piggin wrote:
> > What does unixbench's fstime test do? If it is just writing to the
> > pagecache, then this would be unexpected.
>
> Hmm. Not necessarily. Even just plain writes may be slowed down, since the
> nontemporal loads and stores are generally slower than the normal case. So

Well, it does nontemporal stores only, i.e. it stores into the pagecache
without doubling the cache footprint (the source is possibly already in cache
if we're write(2)ing it anyway).
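
To make the "nontemporal stores only" point concrete, here is a rough
userspace sketch, not the kernel's actual copy routine. The function name
copy_nocache_sketch() is made up for illustration, and the use of the SSE2
intrinsic _mm_stream_si64() assumes an x86-64 build; elsewhere it simply
falls back to memcpy(). Loads from the source are ordinary cached loads;
only the stores to the destination bypass the cache.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && defined(__SSE2__)
#include <emmintrin.h>          /* _mm_stream_si64, _mm_sfence */
#define HAVE_NT_STORES 1
#else
#define HAVE_NT_STORES 0
#endif

/* Hypothetical helper: copy len bytes, using nontemporal stores for the
 * 8-byte words when the destination is 8-byte aligned. */
static void copy_nocache_sketch(void *dst, const void *src, size_t len)
{
#if HAVE_NT_STORES
    if (((uintptr_t)dst & 7u) == 0) {
        long long *d = dst;
        const unsigned char *s = src;
        size_t words = len / 8;
        size_t i;

        for (i = 0; i < words; i++) {
            long long v;
            memcpy(&v, s + 8 * i, sizeof v);  /* normal (cached) load */
            _mm_stream_si64(&d[i], v);        /* nontemporal store */
        }
        _mm_sfence();   /* order the streamed stores before later stores */

        /* Tail of fewer than 8 bytes goes through the cache as usual. */
        memcpy((unsigned char *)dst + words * 8, s + words * 8,
               len - words * 8);
        return;
    }
#endif
    /* Misaligned destination or no NT support: plain cached copy. */
    memcpy(dst, src, len);
}
```

The result is byte-for-byte the same as memcpy(); the difference is only
in whether the destination lines end up in the CPU cache.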

But yes, I could see that if it fills up the store queue and starts going
synchronous, then for sizes < CPU cache size, it will probably go slower.
On the other hand, for cached writes they still have to be sent back
to RAM at some point, but maybe that can be pipelined with other things.

> it does make some kind of sense to try to avoid the noncached versions for
> small writes - because small writes tend to be for temp-files.

I don't see the significance of a temp file. If the pagecache is truncated,
then the cachelines remain dirty and so you can't avoid an eventual store
back to RAM? 

> I don't know if PAGE_SIZE is the right thing to test, and I also don't
> know if this is necessarily the best place to test it in, but I don't
> think it's necessarily wrong to do something like this.

No, but I think it should be in arch code, and the "_nocache" suffix
should just be a hint to the architecture that the destination is not
so likely to be used.
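
A minimal sketch of what "just a hint" could look like, under stated
assumptions: the names arch_pick_copy_path() and hinted_copy() are
hypothetical, the 4096-byte threshold stands in for PAGE_SIZE, and the
nontemporal path is stubbed with memcpy() so the example stays portable.
The point is only that the caller passes a hint and the arch-side policy
is free to ignore it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: stands in for PAGE_SIZE */

enum copy_path { COPY_CACHED, COPY_NONTEMPORAL };

/* Hypothetical arch policy: treat "nocache" as advice, not a command.
 * Small copies stay on the cached path even when the hint is set. */
static enum copy_path arch_pick_copy_path(size_t len, int nocache_hint)
{
    if (!nocache_hint)
        return COPY_CACHED;
    if (len < SKETCH_PAGE_SIZE)
        return COPY_CACHED;
    return COPY_NONTEMPORAL;
}

/* Hypothetical generic-code entry point; returns the path that was chosen. */
static enum copy_path hinted_copy(void *dst, const void *src, size_t len,
                                  int nocache_hint)
{
    enum copy_path path = arch_pick_copy_path(len, nocache_hint);

    assert(dst != NULL && src != NULL);
    /* A real arch implementation would branch to a nontemporal copy
     * here; this sketch uses memcpy() for both paths. */
    memcpy(dst, src, len);
    return path;
}
```

With this split, generic code never tests sizes itself; changing the
threshold or adding other conditions only touches the arch policy.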

> In fact, I think we might also just check alignment. Doing the nontemporal
> crud makes little sense for a non-8-byte-aligned destination, since it
> will have to do the alignment stores cached anyway - and mixing them
> across a cacheline is just crazy.

That is exactly the kind of thing that could probably be better optimised
in arch code IMO.
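
For illustration only, the alignment test being discussed could be as
simple as the following. Both helper names are hypothetical; the 8-byte
figure comes from the quoted text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if dst is 8-byte aligned, i.e. a
 * nontemporal copy would not need any leading cached stores. */
static int dst_is_8byte_aligned(const void *dst)
{
    assert(dst != NULL);
    return ((uintptr_t)dst & 7u) == 0;
}

/* Hypothetical helper: number of leading bytes that would have to be
 * stored through the cache before dst reaches an 8-byte boundary,
 * capped at len. */
static size_t cached_head_bytes(const void *dst, size_t len)
{
    size_t mis = (uintptr_t)dst & 7u;
    size_t head = mis ? 8u - mis : 0;

    return head < len ? head : len;
}
```

A nonzero cached_head_bytes() means the first destination cacheline would
see a mix of cached and nontemporal stores, which is the case the quoted
paragraph calls crazy.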

It would have been nice to have had some numbers to justify
0812a579c92fefa57506821fa08e90f47cb6dbdd in the first place, so you have
a point of reference to see what happens to your speed-up-case when you
change things like this. Sigh.
