Date:	Wed, 25 Feb 2009 14:23:57 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Salman Qazi <sqazi@...gle.com>, davem@...emloft.net,
	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <andi@...stfloor.org>
Subject: Re: Performance regression in write() syscall

On Wednesday 25 February 2009 02:52:34 Linus Torvalds wrote:
> On Tue, 24 Feb 2009, Nick Piggin wrote:
> > > it does make some kind of sense to try to avoid the noncached versions
> > > for small writes - because small writes tend to be for temp-files.
> >
> > I don't see the significance of a temp file. If the pagecache is
> > truncated, then the cachelines remain dirty and so you can't avoid an
> > eventual store back to RAM?
>
> No, because many small files end up being used as scratch-pads (think
> shell script sequences etc), and get read back immediately again. Doing
> non-temporal stores might just be bad simply because trying to play games
> with caching may simply do the wrong thing.

OK, from that angle it could make sense. However, as has been noted earlier,
at this point in the copy we don't have much idea of the length of the
write passed into the VFS (and obviously will never know the higher-level
intention of userspace).

I don't know that we can say a 1-page write is nontemporal but anything
smaller is temporal. I would also worry that behavioural cutoffs like this
will create strange performance boundary conditions in code.
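
To make the cutoff concern concrete, here is a minimal userspace sketch of
a size-based selection between a cached and a non-temporal copy. It is not
the kernel's code: the names (pick_copy_path, copy_hint_nocache), the
4096-byte threshold, and the 8-byte alignment check are all assumptions
made for illustration, and both paths use memcpy() so that it stays
portable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values, chosen only for this sketch. */
#define SKETCH_PAGE_SIZE 4096u  /* assumed cutoff for "big" writes */
#define SKETCH_DST_ALIGN 8u     /* assumed destination alignment requirement */

enum copy_path { COPY_CACHED, COPY_NOCACHE };

/* Decide which path a "_nocache"-hinted copy would take. */
static enum copy_path pick_copy_path(uintptr_t dst, size_t len)
{
    /* Small writes are likely to be read back soon: keep them cached. */
    if (len < SKETCH_PAGE_SIZE)
        return COPY_CACHED;
    /* Assumed arch-specific check: unaligned destinations stay cached. */
    if (dst % SKETCH_DST_ALIGN != 0)
        return COPY_CACHED;
    return COPY_NOCACHE;
}

/* Copy len bytes and report which path was selected. */
static enum copy_path copy_hint_nocache(void *dst, const void *src, size_t len)
{
    enum copy_path path = pick_copy_path((uintptr_t)dst, len);

    /*
     * A real arch implementation would issue non-temporal stores on the
     * COPY_NOCACHE path. This sketch uses memcpy() for both paths.
     */
    memcpy(dst, src, len);
    return path;
}
```

With these assumed values, a 4095-byte write takes the cached path while a
4096-byte write to an aligned destination takes the non-temporal one, which
is the kind of one-byte behavioural boundary described above.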


> > > I don't know if PAGE_SIZE is the right thing to test, and I also don't
> > > know if this is necessarily the best place to test it in, but I don't
> > > think it's necessarily wrong to do something like this.
> >
> > No, but I think it should be in arch code, and the "_nocache" suffix
> > should just be a hint to the architecture that the destination is not
> > so likely to be used.
>
> Yes. Especially since arch code is likely to need various arch-specific
> checks anyway (like the x86 code does about aligning the destination).
>
> > It would have been nice to have had some numbers to justify
> > 0812a579c92fefa57506821fa08e90f47cb6dbdd in the first place, so you have
> > a point of reference to see what happens to your speed-up-case when you
> > change things like this. Sigh.
>
> Well, there were no performance numbers for that commit, since it didn't
> actually tie it into anything, but I'm pretty sure we saw several
> performance numbers for the change.
>
> Yes, and they are in the commit logs. See "x86: cache pollution aware
> __copy_from_user_ll()", commit c22ce143d15eb288543fe9873e1c5ac1c01b69a1.
>
> But notice how that is iozone numbers. Very much about _big_ writes.

Yeah I see, thanks.

