Date:	Fri, 21 Feb 2014 07:01:31 +0100
From:	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:	Al Viro <viro@...iv.linux.org.uk>
Cc:	"Zuckerman, Boris" <borisz@...com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Miklos Szeredi <miklos@...redi.hu>,
	"Theodore T'so" <tytso@....edu>, Christoph Hellwig <hch@....de>,
	Chris Mason <clm@...com>, Dave Chinner <david@...morbit.com>,
	Linux-Fsdevel <linux-fsdevel@...r.kernel.org>,
	"J. Bruce Fields" <bfields@...i.umich.edu>,
	Yongzhi Pan <panyongzhi@...il.com>
Subject: Re: Update of file offset on write() etc. is non-atomic with I/O

On Thu, Feb 20, 2014 at 7:29 PM, Al Viro <viro@...iv.linux.org.uk> wrote:
> On Thu, Feb 20, 2014 at 06:15:15PM +0000, Zuckerman, Boris wrote:
>> Hi,
>>
>> You probably already considered that - sorry, if so...
>>
>> Instead of the mutex Windows use ExecutiveResource with shared and exclusive semantics. Readers serialize by taking the resource shared and writers take it exclusive. I have that implemented for Linux. Please, let me know if there is any interest!
>
> See include/linux/rwsem.h...
>
> Anyway, the really interesting question here is what does POSIX promise
> wrt lseek() vs. write().  What warranties are given there?

I suppose you are wondering about cases such as:

Process A                     Process B
write():                      lseek()
    perform I/O
                              update f_pos
    update f_pos

In my reading of POSIX, lseek() and write() should be atomic w.r.t.
each other, and the above should not be allowed.
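
An interleaving like the one above can be probed from user space. What
follows is a minimal sketch, not code from this thread: the function name
probe_shared_offset(), the record size, the iteration count, and the /tmp
path are all assumptions made for illustration. It uses two writers rather
than write() against lseek(), since lost bytes are easy to observe. Two
processes share one open file description after fork() and both append with
write(); if the offset update is atomic with the I/O, the final file size
should equal the total number of bytes written.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sizes, chosen only to give the race a chance to show up. */
enum { REC_SIZE = 64, RECS_PER_WRITER = 20000 };

/* Append RECS_PER_WRITER fixed-size records through the shared file offset. */
static int append_records(int fd, char fill)
{
    char rec[REC_SIZE];

    memset(rec, fill, sizeof rec);
    for (int i = 0; i < RECS_PER_WRITER; i++)
        if (write(fd, rec, sizeof rec) != (ssize_t)sizeof rec)
            return -1;
    return 0;
}

/*
 * Returns the number of bytes missing from the file (0 if none),
 * or -1 if the probe itself could not be set up or a write failed.
 */
long probe_shared_offset(void)
{
    char path[] = "/tmp/fpos-probe-XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                 /* the file lives on until fd is closed */

    pid_t child = fork();         /* child shares the open file description */
    if (child < 0) {
        close(fd);
        return -1;
    }
    if (child == 0)
        _exit(append_records(fd, 'B') == 0 ? 0 : 1);

    int ok = append_records(fd, 'A') == 0;
    int status = 0;

    if (waitpid(child, &status, 0) != child ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        ok = 0;

    struct stat st;
    long lost = -1;

    if (ok && fstat(fd, &st) == 0)
        lost = 2L * REC_SIZE * RECS_PER_WRITER - (long)st.st_size;
    close(fd);
    return lost;
}
```

A return value of 0 means no bytes went missing in that run; a positive
value is the number of bytes that were overwritten because both writers
started from the same offset. A clean run does not prove atomicity, it only
fails to show a violation.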

Here's the full list from POSIX.1-2008/SUSv4 Section XSH 2.9.7:

[[
2.9.7 Thread Interactions with Regular File Operations

All of the following functions shall be atomic with respect to each
other in the effects specified in POSIX.1-2008 when they operate on
regular files or symbolic links:

chmod( )
chown( )
close( )
creat( )
dup2( )
fchmod( )
fchmodat( )
fchown( )
fchownat( )
fcntl( )
fstat( )
fstatat( )
ftruncate( )
lchown( )
link( )
linkat( )
lseek( )
lstat( )
open( )
openat( )
pread( )
read( )
readlink( )
readlinkat( )
readv( )
pwrite( )
rename( )
renameat( )
stat( )
symlink( )
symlinkat( )
truncate( )
unlink( )
unlinkat( )
utime( )
utimensat( )
utimes( )
write( )
writev( )

If two threads each call one of these functions, each call shall
either see all of the specified effects of the other call, or none
of them.
]]

I'd bet that there's a bunch of violations to be found, but the
read/write f_pos case is one of the most egregious.

For example, I got curious about stat() versus rename(). If one
stat()s a directory while a subdirectory is being renamed to a new
name within that directory, does the link count of the parent
directory ever change--that is, could stat() ever see a changed link
count in the middle of the rename()? My experiments suggest that it
can. I suppose it would have to be a very unusual application that
would be troubled by that, but it does appear to be a violation of
2.9.7.
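
An experiment of this kind could look roughly like the sketch below. It is
not the program used for the observation above; the function name
probe_nlink_during_rename(), the iteration count, and the /tmp path are
assumptions made for illustration. A child process renames a subdirectory
back and forth inside a parent directory while the parent process stat()s
that directory and counts how often st_nlink differs from its initial
value. The outcome may well depend on the filesystem and kernel version.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { RENAME_ROUNDS = 20000 };   /* hypothetical; more rounds, more chances */

/*
 * Returns how many stat() calls saw a link count different from the
 * baseline (0 if none), or -1 if the probe could not be run.
 */
long probe_nlink_during_rename(void)
{
    char parent[] = "/tmp/nlink-probe-XXXXXX";   /* assumed scratch location */
    char a[64], b[64];
    struct stat st;

    if (mkdtemp(parent) == NULL)
        return -1;
    snprintf(a, sizeof a, "%s/a", parent);
    snprintf(b, sizeof b, "%s/b", parent);
    if (mkdir(a, 0700) != 0 || stat(parent, &st) != 0) {
        rmdir(a);
        rmdir(parent);
        return -1;
    }
    nlink_t baseline = st.st_nlink;   /* link count with one subdirectory */

    pid_t child = fork();
    if (child < 0) {
        rmdir(a);
        rmdir(parent);
        return -1;
    }
    if (child == 0) {
        /* Renamer: the subdirectory never leaves the parent directory. */
        for (int i = 0; i < RENAME_ROUNDS; i++)
            if (rename(a, b) != 0 || rename(b, a) != 0)
                _exit(1);
        _exit(0);
    }

    /* Observer: poll the parent's link count until the renamer exits. */
    long deviations = 0;
    int status = 0;
    pid_t w;

    while ((w = waitpid(child, &status, WNOHANG)) == 0)
        if (stat(parent, &st) == 0 && st.st_nlink != baseline)
            deviations++;

    int ok = (w == child) && WIFEXITED(status) && WEXITSTATUS(status) == 0;

    rmdir(a);        /* only one of a/b exists; the other rmdir just fails */
    rmdir(b);
    rmdir(parent);
    return ok ? deviations : -1;
}
```

A nonzero result means that some stat() call observed the parent directory
in the middle of a rename(), which is the kind of partial effect that 2.9.7
appears to rule out. A result of 0 only means that no such window was hit
in that run.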

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/