Message-ID: <20160914213457.GG2356@ZenIV.linux.org.uk>
Date:   Wed, 14 Sep 2016 22:34:58 +0100
From:   Al Viro <viro@...IV.linux.org.uk>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: [RFC] writev() semantics with invalid iovec in the middle

	Right now writev() with a 3-iovec array that has an unmapped address in
the second element and a total length less than PAGE_SIZE will write the
first segment and stop there.  Among other things, that guarantees a
short copy, and I would rather have it yield a 0-byte write (and -EFAULT as
the return value).
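
For anyone who wants to poke at this, here is a minimal userland sketch of the case above.  The helper name writev_with_hole() and the scratch path passed to it are made up for illustration; the segment sizes are arbitrary, chosen only to keep the total well under PAGE_SIZE.  What comes back depends on the kernel and the filesystem, so the code only reports the result and does not assume either outcome.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Issue a 3-segment writev() whose middle segment points at an unmapped
 * page.  Returns whatever writev() returned; on failure *err holds errno.
 * "path" is a scratch file picked by the caller (hypothetical helper).
 */
static ssize_t writev_with_hole(const char *path, int *err)
{
	static char a[16] = "first segment..";
	static char c[16] = "third segment..";
	long psz = sysconf(_SC_PAGESIZE);
	struct iovec iov[3];
	ssize_t ret;
	char *hole;
	int fd;

	assert(psz > 0);
	*err = 0;

	/* Map one page and unmap it again, so the address is known bad. */
	hole = mmap(NULL, psz, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (hole == MAP_FAILED) {
		*err = errno;
		return -1;
	}
	munmap(hole, psz);

	fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
	if (fd < 0) {
		*err = errno;
		return -1;
	}

	iov[0] = (struct iovec){ .iov_base = a,    .iov_len = sizeof(a) };
	iov[1] = (struct iovec){ .iov_base = hole, .iov_len = 16 };
	iov[2] = (struct iovec){ .iov_base = c,    .iov_len = sizeof(c) };

	/* Total length is 48 bytes, far below PAGE_SIZE. */
	ret = writev(fd, iov, 3);
	if (ret < 0)
		*err = errno;

	close(fd);
	unlink(path);
	return ret;
}
```

With the behaviour described above the call returns 16 (the first segment only); with the proposed change it would return -1 with errno set to EFAULT.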

	All POSIX has to say about that is this (in 2.3 Error Numbers):

[EFAULT]
    Bad address. The system detected an invalid address in attempting to use
an argument of a call. The reliable detection of this error cannot be
guaranteed, and when not detected may result in the generation of a signal,
indicating an address violation, which is sent to the process.

Note that an unmapped page in the middle of a range covered already can lead to
the same kind of short write - i.e. if we have
	p = mmap(0, 3*4096, PROT_READ, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	munmap(p + 4096, 4096);
	fd = open("/tmp/foo", O_CREAT|O_TRUNC|O_RDWR, 0777);
	write(fd, p + 2048, 8192);

write() will yield -EFAULT, not 2KB stored.  The same will happen with
	writev(fd, &(struct iovec){p + 2048, 8192}, 1);
BTW, adding lseek(fd, 2049, SEEK_SET); before that write (or writev) will
result in 2047 bytes being written by the latter.

IOW, we do not try to squeeze every byte that can be squeezed out of the
buffer; generally, an unmapped address anywhere in PAGE_SIZE worth of data
that would go into the same page-aligned chunk of destination can result in
a short write cut at the beginning of that chunk.  iovec boundaries act
as barriers to short writes, mostly by accident.
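
To make the arithmetic in the two examples above concrete, here is a toy model of that rule for a single contiguous source buffer.  It is not kernel code: model_short_write() and its parameters are invented for illustration, and it assumes the copy goes one destination page-aligned chunk at a time, with a chunk that touches an unmapped source byte contributing nothing.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (an assumption, not the kernel implementation).
 *
 * dst_off    - file offset where the write starts
 * len        - bytes requested
 * src_mapped - number of readable bytes at the start of the source buffer
 * page_size  - size of a page-aligned destination chunk
 *
 * Returns the number of bytes stored; 0 stands for the -EFAULT case.
 */
static size_t model_short_write(size_t dst_off, size_t len,
				size_t src_mapped, size_t page_size)
{
	size_t done = 0;

	assert(page_size > 0);
	while (done < len) {
		/* Bytes left before the next page boundary in the file. */
		size_t room = page_size - (dst_off + done) % page_size;
		size_t chunk = len - done < room ? len - done : room;

		/* A fault anywhere in the chunk discards the whole chunk. */
		if (done + chunk > src_mapped)
			break;
		done += chunk;
	}
	return done;
}
```

Under this model, model_short_write(0, 8192, 2048, 4096) gives 0, matching the -EFAULT from write(fd, p + 2048, 8192), and model_short_write(2049, 8192, 2048, 4096) gives 2047, matching the lseek() variant.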

Do we need to preserve that special treatment of iovec boundaries?  I would
really like to get rid of that - the current behaviour is an easy and reliable
way to trigger a short copy case in ->write_end() and those are fairly
brittle.  Sure, we still need to cope with them, and I think I've got all
instances in the current mainline fixed, but they are often suboptimal.

Objections?
