Message-ID: <20200918131317.GH18920@quack2.suse.cz>
Date: Fri, 18 Sep 2020 15:13:17 +0200
From: Jan Kara <jack@...e.cz>
To: Mikulas Patocka <mpatocka@...hat.com>
Cc: Dan Williams <dan.j.williams@...el.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
Eric Sandeen <esandeen@...hat.com>,
Dave Chinner <dchinner@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: the "read" syscall sees partial effects of the "write" syscall
On Fri 18-09-20 08:25:28, Mikulas Patocka wrote:
> I'd like to ask about this problem: when we write to a file, the kernel
> takes the write inode lock. When we read from a file, no lock is taken -
> thus the read syscall can read data that are halfway modified by the write
> syscall.
>
> The standard specifies the effects of the write syscall are atomic - see
> this:
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_07

Yes, but no Linux filesystem (except for XFS AFAIK) follows the POSIX spec
in this regard, mostly because mixed read-write performance sucks when
you follow it. (It would not absolutely have to suck: you can use clever
locking with range locks, but nobody does it currently.) In practice,
read-write atomicity on Linux holds only on a per-page basis for buffered
IO.
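
A minimal userspace sketch of how the effect might be observed, assuming
Linux, buffered IO on a regular file, and a filesystem that does not
serialize reads against writes. The helper names is_torn and
count_torn_reads, the 16-page span and the round count are all
hypothetical choices for illustration. A writer thread repeatedly
overwrites the same multi-page range with a single fill byte, and the
reader counts how often one read returns a mix of two fills:

```python
import os
import tempfile
import threading

PAGE = os.sysconf("SC_PAGE_SIZE")  # page size of the running system
SPAN = 16 * PAGE                   # each write/read covers several pages


def is_torn(buf: bytes) -> bool:
    """Return True if buf contains more than one distinct byte value,
    i.e. it cannot have come from a single uniform write."""
    return bool(buf) and buf.count(buf[0:1]) != len(buf)


def count_torn_reads(path: str, rounds: int = 1000) -> int:
    """Race one writer thread against `rounds` reads of the same range
    and return how many reads saw a partially applied write."""
    fills = (b"A" * SPAN, b"B" * SPAN)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.pwrite(fd, fills[0], 0)     # start from a uniform file
    stop = threading.Event()

    def writer() -> None:
        i = 0
        while not stop.is_set():
            i ^= 1
            os.pwrite(fd, fills[i], 0)  # one write() spanning many pages

    t = threading.Thread(target=writer)
    t.start()
    torn = 0
    try:
        for _ in range(rounds):
            if is_torn(os.pread(fd, SPAN, 0)):
                torn += 1
    finally:
        stop.set()
        t.join()
        os.close(fd)
    return torn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        n = count_torn_reads(os.path.join(d, "race.dat"))
        print(f"torn reads: {n} of 1000")
```

The count is timing-dependent and may be zero on any given run; a
non-zero count means a read returned pages from two different writes.
On a filesystem that does serialize reads against writes, the count
would be expected to stay at zero.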
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR