linux-kernel - Re: copy_from_user_*() and buffer zeroing

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <529529D8.1090803@zytor.com>
Date:	Tue, 26 Nov 2013 15:08:08 -0800
From:	"H. Peter Anvin" <hpa@...or.com>
To:	NeilBrown <neilb@...e.de>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...nel.org>,
	Al Viro <viro@...IV.linux.org.uk>,
	Thomas Gleixner <tglx@...utronix.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Vitaly Mayatskikh <v.mayatskih@...il.com>,
	"Murty, Ravi" <ravi.murty@...el.com>,
	Nick Piggin <npiggin@...il.com>
Subject: Re: copy_from_user_*() and buffer zeroing

On 11/26/2013 03:04 PM, NeilBrown wrote:
> On Tue, 26 Nov 2013 14:28:59 -0800 "H. Peter Anvin" <hpa@...or.com>
> wrote:
> 
>> On 11/26/2013 01:54 PM, Andrew Morton wrote:
>>> 
>>> Nine years ago:
>>> 
>>> commit 7079f897164cb14f616c785d3d01629fd6a97719 Author: mingo
>>> <mingo> Date:   Fri Aug 27 17:33:18 2004 +0000
>>> 
>>> [PATCH] Add a few might_sleep() checks
>>> 
>>> Add a whole bunch more might_sleep() checks.  We also enable
>>> might_sleep() checking in copy_*_user().  This was non-trivial
>>> because of the "copy_*_user() in atomic regions" trick would
>>> generate false positives.  Fix that up by adding a new
>>> __copy_*_user_inatomic(), which avoids the might_sleep()
>>> check.
>>> 
>>> Only i386 is supported in this patch.
>>> 
>>> 
>>> I can't think of any reason why __copy_from_user_inatomic()
>>> should be non-zeroing.  But maybe I'm missing something - this
>>> would pretty easily permit uninitialised data to appear in
>>> pagecache and someone surely would have noticed..
>>> 
>> 
>> Yes, and the might_sleep() check is indeed bypassed.
>> 
>> However, the non-zeroing bit is currently motivated by the
>> following comment:
>> 
>> /** * __copy_from_user: - Copy a block of data from user space,
>> with less checking. * @to:   Destination address, in kernel
>> space. * @from: Source address, in user space. * @n:    Number of
>> bytes to copy. * * Context: User context only.  This function may
>> sleep. * * Copy data from user space to kernel space.  Caller
>> must check * the specified block with access_ok() before calling
>> this function. * * Returns number of bytes that could not be
>> copied. * On success, this will be zero. * * If some data could
>> not be copied, this function will pad the copied * data to the
>> requested size using zero bytes. * * An alternate version -
>> __copy_from_user_inatomic() - may be called from * atomic context
>> and will fail rather than sleep.  In this case the * uncopied
>> bytes will *NOT* be padded with zeros.  See fs/filemap.h * for
>> explanation of why this is needed. */
>> 
>> This comment is only present in the 32-bit code.  fs/filemap.h of
>> course no longer exists, however, the original commit seems to
>> be 01408c4939479ec46c15aa7ef6e2406be50eeeca which puts a comment
>> in the (now defunct mm/filemap.h).
>> 
>> I have to say I don't follow the explanation in that patch.  It
>> seems like if you're concurrently reading a buffer being written
>> you should expect to get any kind of mismash...
>> 
>> Neil, is this still an issue?
>> 
> 
> I can't be certain if this is "still" and issue as many things
> could have changed and I haven't been following them. I can try to
> explain the original issue though.
> 
> If a process tries to read a file while another process is writing
> to the same page of the same file, then it is quite reasonable for
> the reader to see almost any combination of the old and the new
> data.  However it is wrong for it so see something else.  In
> particular if the file actually contains no nuls, and the writer
> doesn't write any nuls, then the read should not see any nuls.
> 
> At the time of this patch, that could happen. If the page contains
> valid data it will not be locked, and a read can succeed at any
> time without further locking. When writing to a page,
> filemap_copy_from_user would first try an atomic copy and if that
> failed, it could write zeros into the page, which would then be 
> over-written by a subsequent non-atomic copy. This leaves a small
> window where zeros can be seen in the page by a read (or a
> memory-mapping).
> 
> A quick look at the code history shows that Nick Piggin removed the
> comment from mm/filemap.h in commit  4a9e5ef1f4f15205e477817a5 and
> it looks like the code was changed so it doesn't "try one way, then
> try another".  So it could well be that the failure mode that
> caused the problem before is no longer a possible failure mode. And
> if that failure mode is no longer possible, then maybe
> copy_from_user will never fail and so never has a need to fill with
> zeros??
> 

Nick, could you perhaps comment on this?

> The reason only i386 was changed it that it was the only arch were 
> copy_from_user_atomic might ever zero a tail.  Most arch just used
> memcpy or similar.  powerpc is the only other arch that defined a
> non-trivial copy_from_user_atomic and I confirmed at the time that
> it would never (need to) zero a tail.

Well, there are several that do now...

	-hpa




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/