Message-ID: <9c22700a16db4a4f8ae9203efcaed27b@AcuMS.aculab.com>
Date: Thu, 23 Jul 2020 08:37:27 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Catalin Marinas' <catalin.marinas@....com>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
linux-arch <linux-arch@...r.kernel.org>,
"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Subject: RE: [RFC] raw_copy_from_user() semantics
From: Catalin Marinas
> Sent: 22 July 2020 17:54
>
> On Wed, Jul 22, 2020 at 01:14:21PM +0000, David Laight wrote:
> > From: Catalin Marinas
> > > Sent: 22 July 2020 12:37
> > > On Sun, Jul 19, 2020 at 12:34:11PM -0700, Linus Torvalds wrote:
> > > > On Sun, Jul 19, 2020 at 12:28 PM Linus Torvalds
> > > > <torvalds@...ux-foundation.org> wrote:
> > > > > I think we should try to get rid of the exact semantics.
> > > >
> > > > Side note: I think one of the historical reasons for the exact
> > > > semantics was that we used to do things like the mount option copying
> > > > with a "copy_from_user()" iirc.
> > > >
> > > > And that could take a fault at the end of the stack etc, because
> > > > "copy_mount_options()" is nasty and doesn't get a size, and just
> > > > copies "up to 4kB" of data.
> > > >
> > > > It's a mistake in the interface, but it is what it is. But we've
> > > > always handled the inexact count there anyway by originally doing byte
> > > > accesses, and at some point you optimized it to just look at where
> > > > page boundaries might be..
> > >
> > > And we may have to change this again since, with arm64 MTE, the page
> > > boundary check is insufficient:
> > >
> > > https://lore.kernel.org/linux-fsdevel/20200715170844.30064-25-catalin.marinas@arm.com/
> > >
> > > While currently the fault path is unlikely to trigger, with MTE in user
> > > space it's a lot more likely since the buffer (e.g. a string) is
> > > normally less than 4K and the adjacent addresses would have a different
> > > colour.
> > >
> > > I looked (though briefly) into passing the copy_from_user() problem to
> > > filesystems that would presumably know better how much to copy. In most
> > > cases the options are strings, so something like strncpy_from_user()
> > > would work. For mount options as binary blobs (IIUC btrfs) maybe the fs
> > > has a better way to figure out how much to copy.
> >
> > What about changing the mount code to loop calling get_user()
> > to read aligned words until failure?
> > Mount is fairly uncommon and the extra cost is probably small compared
> > to the rest of doing a mount.
>
> Before commit 12efec560274 ("saner copy_mount_options()"), it was using
> single-byte get_user(). That could have been optimised for aligned words
> reading but I don't really think it's worth the hassle. Since the source
> and destination don't have the same alignment and some architectures
> don't support unaligned accesses (for storing to the kernel buffer), it
> would just make this function unnecessarily complicated.
It could read aligned words if the user buffer is aligned (it will be
most of the time) and bytes otherwise.
Or just fall back to a byte loop if the full 4k read fails.
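A minimal userspace sketch of the "fall back to a byte loop" idea, not kernel code. It assumes a hypothetical model of user memory (struct fake_user_mem) in which only the first `readable` bytes can be accessed. model_copy_all() stands in for a bulk copy that either succeeds completely or reports a fault without an exact count, and model_get_byte() stands in for a single-byte get_user(). All of these names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OPTS_MAX 4096	/* the "up to 4kB" limit mentioned in the thread */

/* Hypothetical model of user memory: offsets [0, readable) are
 * accessible, anything at or beyond 'readable' faults. */
struct fake_user_mem {
	const unsigned char *base;
	size_t readable;
};

/* Stand-in for a bulk copy with inexact semantics: all or nothing.
 * Returns 0 on success, -1 if any byte in the range would fault. */
static int model_copy_all(unsigned char *dst, const struct fake_user_mem *m,
			  size_t off, size_t n)
{
	if (off > m->readable || n > m->readable - off)
		return -1;
	memcpy(dst, m->base + off, n);
	return 0;
}

/* Stand-in for a single-byte get_user(): 0 on success, -1 on fault. */
static int model_get_byte(unsigned char *out, const struct fake_user_mem *m,
			  size_t off)
{
	if (off >= m->readable)
		return -1;
	*out = m->base[off];
	return 0;
}

/* Try the full 4k copy first; only if that fails, walk the buffer one
 * byte at a time until the first fault and zero-fill the remainder.
 * Returns the number of bytes actually copied. */
static size_t model_copy_options(unsigned char dst[OPTS_MAX],
				 const struct fake_user_mem *m)
{
	size_t i;

	assert(dst && m);

	if (model_copy_all(dst, m, 0, OPTS_MAX) == 0)
		return OPTS_MAX;

	for (i = 0; i < OPTS_MAX; i++)
		if (model_get_byte(&dst[i], m, i) != 0)
			break;

	memset(dst + i, 0, OPTS_MAX - i);
	return i;
}
```

The slow path only runs when the bulk copy faults, so the common case stays a single copy and the caller never depends on the bulk copy reporting an exact count.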
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)