Date: Mon, 3 Mar 2014 09:36:58 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
Cc: lkml <linux-kernel@...r.kernel.org>,
Miklos Szeredi <miklos@...redi.hu>,
"Theodore T'so" <tytso@....edu>, Christoph Hellwig <hch@....de>,
Chris Mason <clm@...com>, Dave Chinner <david@...morbit.com>,
Linux-Fsdevel <linux-fsdevel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>,
"J. Bruce Fields" <bfields@...i.umich.edu>,
Yongzhi Pan <panyongzhi@...il.com>
Subject: Re: Update of file offset on write() etc. is non-atomic with I/O

Ok, sorry for the long delay, I was distracted (and hoping that Al
would come up with a patch).

Anyway, attached is the patch I think we should do for this issue. It
is fairly simple:

 - it adds a "f_pos_mutex" to the "struct file".

 - it adds a new FMODE_ATOMIC_POS flag to the file mode flags to mark
   things that need atomic f_pos updates

 - it makes the "struct fd" flags be two flags rather than one: the
   second flag is for "unlock f_pos_mutex when done"

 - it introduces "fd[get,put]_pos()" which gets the f_pos_mutex when required

 - it makes read/write/lseek use that.
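
[Editor's sketch] The list above can be pictured with a small userspace
analogue. This is not the attached patch and not kernel code: the struct
layouts, the flag values, the pthread mutex, and the toy_write() helper are
all assumptions made for illustration. Only the names f_pos_mutex,
FMODE_ATOMIC_POS, fdget_pos() and fdput_pos() come from the mail.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical values; the real kernel constants may differ. */
#define FMODE_ATOMIC_POS 0x1u   /* file needs atomic f_pos updates */
#define FDPUT_FPUT       0x1u   /* first struct fd flag (unused in this sketch) */
#define FDPUT_POS_UNLOCK 0x2u   /* second flag: unlock f_pos_mutex when done */

struct file {
    unsigned int    f_mode;
    long            f_pos;
    pthread_mutex_t f_pos_mutex;
    char            data[64];   /* toy backing store instead of a real inode */
};

struct fd {
    struct file *file;
    unsigned int flags;
};

void file_init(struct file *f, unsigned int mode)
{
    memset(f, 0, sizeof(*f));
    f->f_mode = mode;
    pthread_mutex_init(&f->f_pos_mutex, NULL);
}

/* Take f_pos_mutex only for files marked FMODE_ATOMIC_POS, and record
 * in the returned struct fd that it must be released later. */
struct fd fdget_pos(struct file *f)
{
    struct fd fd = { f, 0 };

    if (f->f_mode & FMODE_ATOMIC_POS) {
        pthread_mutex_lock(&f->f_pos_mutex);
        fd.flags |= FDPUT_POS_UNLOCK;
    }
    return fd;
}

/* Drop f_pos_mutex if (and only if) fdget_pos() took it. */
void fdput_pos(struct fd fd)
{
    assert(fd.file != NULL);
    if (fd.flags & FDPUT_POS_UNLOCK)
        pthread_mutex_unlock(&fd.file->f_pos_mutex);
}

/* Toy write(): read f_pos, copy the data, and update f_pos, all between
 * fdget_pos() and fdput_pos(). Returns the number of bytes stored. */
long toy_write(struct file *f, const char *buf, size_t len)
{
    struct fd fd = fdget_pos(f);
    long pos = f->f_pos;

    if ((size_t)pos + len > sizeof(f->data))
        len = sizeof(f->data) - (size_t)pos;   /* short write at end of store */
    memcpy(f->data + pos, buf, len);
    f->f_pos = pos + (long)len;

    fdput_pos(fd);
    return (long)len;
}
```

In this sketch a file initialised without FMODE_ATOMIC_POS never touches
the mutex, which is the "minimally serializing" idea: only files that need
atomic position updates pay for the lock.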

It's pretty damn straightforward, I think, and is minimally serializing.

Al, comments? Yongzhi Pan, this is pretty much untested, but it's
pretty simple and it does fix your test-case.

Linus

On Thu, Feb 20, 2014 at 9:14 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> Yes, I do think we violate POSIX here because of how we handle f_pos
> (the earlier thread from 2006 you point to talks about the "thread
> safe" part, the point here about the actual wording of "atomic" is a
> separate issue).
>
> Long long ago we used to just pass in the pointer to f_pos directly,
> and then the low-level write would update it all under the inode
> semaphore, and all was good.
>
> And then we ended up having tons of problems with non-regular files
> and drivers accessing f_pos and having nasty races with it because
> they did *not* have any locking (and very fundamentally didn't want
> any), and we broke the serialization of f_pos. We still do the *IO*
> atomically, but yes, the f_pos access itself is outside the lock.
>
> Ho humm.. Al, any ideas of how to fix this?
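
[Editor's sketch] The race described in the quoted message (the I/O is
atomic, but f_pos is read and written back outside the lock) can be
replayed deterministically in a single thread. Everything here is a toy
model and an assumption made for illustration: struct toy_file, the three
step functions, and the fixed interleaving of two writers "A" and "B" are
hypothetical and are not taken from the kernel or from the reported
test-case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_file {
    long f_pos;
    char data[16];
};

/* Step 1 of a write: sample the shared offset, with no lock held. */
long sample_pos(const struct toy_file *f)
{
    return f->f_pos;
}

/* Step 2: the I/O itself, done at the offset sampled earlier. */
void do_io(struct toy_file *f, long pos, const char *buf, size_t len)
{
    assert((size_t)pos + len <= sizeof(f->data));
    memcpy(f->data + pos, buf, len);
}

/* Step 3: store the advanced offset back, again with no lock held. */
void store_pos(struct toy_file *f, long pos)
{
    f->f_pos = pos;
}

/* Replay one bad interleaving of two 4-byte writers and return the
 * final f_pos. Both writers sample the same offset before either one
 * stores its update, so B's data lands on top of A's. */
long replay_race(struct toy_file *f)
{
    long pos_a = sample_pos(f);   /* writer A sees the current offset */
    long pos_b = sample_pos(f);   /* writer B sees the same offset    */

    do_io(f, pos_a, "AAAA", 4);
    do_io(f, pos_b, "BBBB", 4);   /* overwrites A's bytes */

    store_pos(f, pos_a + 4);
    store_pos(f, pos_b + 4);      /* same value: one update is lost */
    return f->f_pos;
}
```

Starting from a zeroed toy_file, two 4-byte writes should leave f_pos at 8
with both payloads present. With this interleaving f_pos ends at 4 and only
B's bytes survive, which is the kind of result that serializing the whole
sample/IO/store sequence per file is meant to rule out.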
View attachment "patch.diff" of type "text/plain" (6243 bytes)