[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <465A7894.5090101@yahoo.com.au>
Date: Mon, 28 May 2007 16:37:08 +1000
From: Nick Piggin <nickpiggin@...oo.com.au>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] Preserve the dirty bit in init_page_buffers
Eric W. Biederman wrote:
> Nick Piggin <nickpiggin@...oo.com.au> writes:
>
>
>>Eric W. Biederman wrote:
>>
>>>The problem: When we are trying to free buffers try_to_free_buffers
>>>will look at ramdisk pages with clean buffer heads and remove the
>>>dirty bit from the page. Resulting in ramdisk pages with data that
>>>get removed from the page cache. Ouch!
>>>
>>>Buffer heads appear on ramdisk pages when a filesystem calls getblk,
>>>which through a series of function calls eventually calls
>>>init_page_buffers.
>>>
>>>So to fix the mismatch between buffer head state and page state this
>>>patch modifies init_page_buffers to transfer the dirty bit from the
>>>page to the buffer heads like we currently do for the uptodate bit.
>>
>>Ouch indeed!
>>
>>But can we ever have a dirty page at init_page_buffers-time?
>
>
> Definitely, and it was a royal pain to trace the bug that this
> caused. An initial ramdisk having pieces disappear after mkfs
> is called can look like the entire machine is dying.
>
> When we initialize the ramdisk by writing to /dev/ram0 usually in
> init/do_mounts_rd.c we don't allocate buffer heads but we do set
> the dirty bit, and the page is in the page cache. So when we
> later call getblk it reuses the same page and then calls
> init_page_buffers.
Hmm, the comment above grow_dev_buffers indicates this should
not happen. But contrary to the comment, it doesn't go BUG
unless you're attaching dirty buffers to a page with dirty
buffers.
I suspect this happens more frequently with rd.c, because unlike
block_dev.c, it does not create dirty buffers in prepare_write.
However it could still happen in block_dev.c via mmaped memory,
as I said earlier.
I'm not saying this patch 1/3 is wrong, but we would at least
need to revise some comments. The grow_dev_buffers comment looks
like one of Andrew's. Maybe he can shed some more light on this?
--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists