linux-kernel - Re: [patch] x86, mm: pass in 'total' to __copy_from_user

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <200902251909.42928.nickpiggin@yahoo.com.au>
Date:	Wed, 25 Feb 2009 19:09:42 +1100
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Salman Qazi <sqazi@...gle.com>, davem@...emloft.net,
	linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <andi@...stfloor.org>
Subject: Re: [patch] x86, mm: pass in 'total' to __copy_from_user_*nocache()

On Wednesday 25 February 2009 18:25:03 Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@...oo.com.au> wrote:
> > On Wednesday 25 February 2009 02:52:34 Linus Torvalds wrote:
> > > On Tue, 24 Feb 2009, Nick Piggin wrote:
> > > > > it does make some kind of sense to try to avoid the noncached
> > > > > versions for small writes - because small writes tend to be for
> > > > > temp-files.
> > > >
> > > > I don't see the significance of a temp file. If the pagecache is
> > > > truncated, then the cachelines remain dirty and so you can't avoid an
> > > > eventual store back to RAM?
> > >
> > > No, because many small files end up being used as scratch-pads (think
> > > shell script sequences etc), and get read back immediately again. Doing
> > > non-temporal stores might just be bad simply because trying to play
> > > games with caching may simply do the wrong thing.
> >
> > OK, for that angle it could make sense. Although as has been
> > noted earlier, at this point of the copy, we don't have much
> > idea about the length of the write passed into the vfs (and
> > obviously will never know the higher level intention of
> > userspace).
> >
> > I don't know if we can say a 1 page write is nontemporal, but
> > anything smaller is temporal. And having these kinds of
> > behavioural cutoffs I would worry will create strange
> > performance boundary conditions in code.
>
> I agree in principle.
>
> The main artifact would be the unaligned edges around a bigger
> write. In particular the tail portion of a big write will be
> cached.
>
> For example if we write a 100,000 bytes file, we'll copy the
> first 24 pages (98304 bytes) uncached, while the final 1696
> bytes cached. But there is nothing that necessiates this
> assymetry.
>
> For that reason it would be nice to pass down the total size of
> the write to the assembly code. These are single-usage-site APIs
> anyway so it should be easy.
>
> I.e. something like the patch below (untested). I've extended
> the copy APIs to also pass down a 'total' size as well, and
> check for that instead of the chunk 'len'. Note that it's
> checked in the inlined portion so all the "total == len" special
> cases will optimize out the new parameter completely.

This does give more information, not exactly all (it could be
a big total write with many smaller writes especially if the
source is generated on the fly and will be in cache, or if the
source is not in cache, then we would also want to do nontemporal
loads from there etc etc).


> This should express the 'large vs. small' question adequately,
> with no artifacts. Agreed?

Well, no artifacts, but it still has a boundary condition where
one might cut from temporal to nontemporal behaviour.

If it is a *really* important issue, maybe some flags should be
incorporated into an extended API? It not, then I wonder if it
is important enough to add such complexity for?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/