[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.1.00.0801231329120.2803@woody.linux-foundation.org>
Date: Wed, 23 Jan 2008 13:36:45 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Miklos Szeredi <miklos@...redi.hu>
cc: a.p.zijlstra@...llo.nl, salikhmetov@...il.com, linux-mm@...ck.org,
jakob@...hought.net, linux-kernel@...r.kernel.org,
valdis.kletnieks@...edu, riel@...hat.com, ksm@...dk,
staubach@...hat.com, jesper.juhl@...il.com,
akpm@...ux-foundation.org, protasnb@...il.com,
r.e.wolff@...wizard.nl, hidave.darkstar@...il.com,
hch@...radead.org
Subject: Re: [PATCH -v8 3/4] Enable the MS_ASYNC functionality in
sys_msync()
On Wed, 23 Jan 2008, Miklos Szeredi wrote:
>
> Yeah, nasty.
>
> How about doing it in a separate pass, similarly to
> wait_on_page_writeback()? Just instead of waiting, clean the page
> tables for writeback pages.
That sounds like a good idea, but it doesn't work.
The thing is, we need to hold the page-table lock over the whole sequence
of
if (page_mkclean(page))
set_page_dirty(page);
if (TestClearPageDirty(page))
..
and there's a big comment about why in clear_page_dirty_for_io().
So if you split it up, so that the first phase is that
if (page_mkclean(page))
set_page_dirty(page);
and the second phase is the one that just does a
if (TestClearPageDirty(page))
writeback(..)
and having dropped the page lock in between, then you lose: because
another thread migth have faulted in and re-dirtied the page table entry,
and you MUST NOT do that "TestClearPageDirty()" in that case!
That dirty bit handling is really really important, and it's sadly also
really really easy to get wrong (usually in ways that are hard to even
notice: things still work 99% of the time, and you might just be leaking
memory slowly, and fsync/msync() might not write back memory mapped data
to disk at all etc).
> Sure, I would have though all of this stuff is 2.6.25, but it's your
> kernel... :)
Well, the plain added "file_update_time()" call addition looked like a
trivial fix, and if there are actually *customers* that have bad backups
due to this, then I think that part was worth doing. At least a "sync"
will then sync the file times...
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists