Message-Id: <20120613142949.734818a8.akpm@linux-foundation.org>
Date: Wed, 13 Jun 2012 14:29:49 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: linux-kernel@...r.kernel.org, Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH 2/2] msync: start async writeout when MS_ASYNC
On Thu, 31 May 2012 22:43:55 +0200
Paolo Bonzini <pbonzini@...hat.com> wrote:
> msync.c says that applications had better use fsync() or fadvise(FADV_DONTNEED)
> instead of MS_ASYNC. Both pieces of advice are really bad:
>
> * fsync() can be a replacement for MS_SYNC, not for MS_ASYNC;
>
> * fadvise(FADV_DONTNEED) invalidates the pages completely, which will make
> later accesses expensive.
>
> Having the possibility to schedule a writeback immediately is an advantage
> for the applications. They can do the same thing that fadvise does,
> but without the invalidation part. The implementation is also similar
> to fadvise, but with tag-and-write enabled.
>
> One example is if you are implementing a persistent dirty bitmap.
> Whenever you set bits to 1 you need to synchronize it with MS_SYNC, so
> that dirtiness is reported properly after a host crash. If you have set
> any bits to 0, getting them to disk is not needed for correctness, but
> it is still desirable to save some work after a host crash. You could
> simply use MS_SYNC in a separate thread, but MS_ASYNC provides exactly
> the desired semantics and is easily done in the kernel.
>
> If the application does not want to start I/O, it can simply call msync
> with flags equal to MS_INVALIDATE. This one remains a no-op, as it should
> be on a reasonable implementation.
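
For reference, the persistent dirty bitmap pattern described above could
look roughly like the sketch below. The struct and function names are
hypothetical, not taken from the patch, and it assumes the bitmap is a
page-aligned MAP_SHARED file mapping. Setting a bit waits for the data
with MS_SYNC; clearing a bit only passes MS_ASYNC, which starts writeout
only if the proposed change is applied.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical layout: bits is a page-aligned MAP_SHARED mapping of len bytes. */
struct dirty_bitmap {
    uint8_t *bits;
    size_t len;
};

/* msync() the single page that holds the given bit; flags is MS_SYNC or MS_ASYNC. */
static int bitmap_flush_page(struct dirty_bitmap *bm, size_t bit, int flags)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t start = (bit / 8) / page * page;  /* page-aligned byte offset */
    size_t n = bm->len - start;

    if (n > page)
        n = page;
    return msync(bm->bits + start, n, flags);
}

/* Setting a bit must reach the disk before the caller proceeds. */
static int bitmap_set(struct dirty_bitmap *bm, size_t bit)
{
    if (bit / 8 >= bm->len)
        return -1;
    bm->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
    return bitmap_flush_page(bm, bit, MS_SYNC);
}

/* Clearing a bit is only an optimisation, so do not wait for the write. */
static int bitmap_clear(struct dirty_bitmap *bm, size_t bit)
{
    if (bit / 8 >= bm->len)
        return -1;
    bm->bits[bit / 8] &= (uint8_t)~(1u << (bit % 8));
    return bitmap_flush_page(bm, bit, MS_ASYNC);
}
```

A caller would mmap() the bitmap file with MAP_SHARED, fill in the struct,
and call bitmap_set() before dirtying the data the bit describes.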

This change means that people will find their existing msync(MS_ASYNC)
calls newly starting IO. This may well be undesirable for some.

Also, it hardwires into the kernel a behaviour which userspace itself
could have initiated with sync_file_range(), i.e. reduced flexibility.

Perhaps we can update the msync.c code comments to direct people to
sync_file_range()?
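
As a rough illustration of the userspace alternative, the sketch below
starts writeout for a byte range of the file backing a shared mapping
without waiting for it to finish. The helper name is hypothetical;
sync_file_range() is Linux-specific and needs _GNU_SOURCE, and the caller
is assumed to know the file offset that corresponds to the mapped range.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper: ask the kernel to start writeback of the dirty
 * pages in [offset, offset + nbytes) of fd and return immediately.
 * Nothing is invalidated and nothing is waited for.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int start_async_writeout(int fd, off_t offset, off_t nbytes)
{
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
}
```

An application that does not want IO started simply does not make the
call, which is the flexibility referred to above.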

One wonders how msync() works with nonlinear mappings. I guess
"badly". I think this was all discussed when we merged
remap_file_pages() (what a mistake that was) and we decided "too hard".